Working with jagged tensors is a real pain, especially when you have to translate back and forth from a fixed layout mentality to different jagged mentalities. Einops notation makes this easier. Einops support would make implementation easier too.
Example: sequences of sequences
I have data `x` of shape `batch outer inner dim`, or more precisely $[b, o_b, i_{b, o}, d]$, because the outer and inner sequence lengths vary. A nice way to work with this sort of data in einops terms is to view `x` as `(b o) i d`, keep a second tensor `y` of shape `(b o) d`, and then alternate between `x, y = inner_network(x, y)` and `y = outer_network(y)`.
This means viewing `y` as `(b o) d` to talk to the `(b o) i d`-shaped `x`, then viewing `y` as `b o d` to operate across the outer list level, and so on back and forth.
Juggling the offsets for this by hand is a pain that makes `x.reshape(2,-1,i,1).transpose(99, 42)` look pleasant, while the einops model of `b o i d -> (b o) i d`, `(b o) d -> b o d`, and `b o d -> (b o) d` is super clear.
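To make the bookkeeping concrete, here is a minimal NumPy sketch of the two-level layout described above. It is only an illustration under my own assumptions: the names `outer_offsets`, `inner_offsets`, and `segment_mean` are hypothetical, and a mean over each segment stands in for the inner and outer networks.

```python
import numpy as np

d = 2
# Batch of 2 items: item 0 has 2 outer elements, item 1 has 1.
# Inner lengths per (b, o) pair, listed in flattened (b o) order.
outer_lengths = np.array([2, 1])
inner_lengths = np.array([3, 1, 2])

# Offsets are cumulative lengths with a leading zero.
outer_offsets = np.concatenate([[0], np.cumsum(outer_lengths)])  # [0, 2, 3]
inner_offsets = np.concatenate([[0], np.cumsum(inner_lengths)])  # [0, 3, 4, 6]

# x stores every inner element back to back: the jagged "(b o) i d" view.
x = np.arange(inner_offsets[-1] * d, dtype=float).reshape(-1, d)  # shape (6, 2)


def segment_mean(values, offsets):
    """Mean over each [offsets[k], offsets[k + 1]) slice of the leading axis.

    Hypothetical helper; assumes every segment is non-empty.
    """
    return np.stack(
        [values[s:e].mean(axis=0) for s, e in zip(offsets[:-1], offsets[1:])]
    )


# "(b o) i d -> (b o) d": reduce over the jagged inner axis.
y = segment_mean(x, inner_offsets)  # shape (3, 2)

# "(b o) d -> b o d" is the same buffer read through outer_offsets;
# reducing over o then gives one row per batch item.
z = segment_mean(y, outer_offsets)  # shape (2, 2)
```

Note that `y` never moves in memory between the `(b o) d` and `b o d` views; only the set of offsets used to read it changes, which is exactly the part that is tedious to track by hand.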
Proposal
Einops already has `pack` and `unpack`: `pack` returns a packed value together with the packed shapes, which play the role of offsets here. Let `rearrange`, or a new `repack` or similar, take those offsets and a recipe and return new offsets.
Implementation-wise, `repack(ps, recipe)` or `repack(x, ps, recipe)` is tedious but fairly straightforward. I'll be doing it myself for PyTorch, but would love to put it in einops to share the benefit.
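As a rough idea of what the offset arithmetic might look like, here is a sketch of a single recipe, `b o i -> b (o i)`, in plain Python. This is not an existing einops function: `merge_into_parent` and `lengths` are hypothetical names, and a real `repack` would need to parse the recipe and handle many more cases.

```python
def merge_into_parent(outer_offsets, inner_offsets):
    """Offsets for 'b o i -> b (o i)' (hypothetical helper).

    outer_offsets index into the (b o) axis; inner_offsets index into the
    flat row axis. Composing them gives, per batch item, the range of flat
    rows it owns.
    """
    return [inner_offsets[k] for k in outer_offsets]


def lengths(offsets):
    """Segment lengths implied by a list of offsets."""
    return [end - start for start, end in zip(offsets, offsets[1:])]


outer_offsets = [0, 2, 3]      # batch item 0 has 2 outer elements, item 1 has 1
inner_offsets = [0, 3, 4, 6]   # inner lengths 3, 1, 2 in (b o) order

merged = merge_into_parent(outer_offsets, inner_offsets)
print(merged)           # [0, 4, 6]
print(lengths(merged))  # [4, 2]
```

The values buffer is untouched here; only the offsets change, so this kind of repack can be a pure metadata operation.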
Bonus points
As a bonus, this model (offsets and values) is how Arrow manages nested structures, so it would let einops users pass data to and from Arrow without copying or being limited to fixed-size columns.
Cheers and thanks for the great library :)