If our hlist is heterogeneous is nonempty, I think we don't need to write out a lot of these additional constraints. To elaborate, transformerLM, as it currently stands, require constraints such as:
transformerLM ::
forall
numAttnLayers nhead ffnDim paddingId vocabSize dmodel dhead seqLen ...
( HLast
( HReplicateR
(numAttnLayers)
( Tensor device dtype [batchSize, seqLen, dmodel],
DMap
(BlockCache device dtype batchSize seqLen dmodel dhead nhead)
Identity
)
)
( Tensor device dtype [batchSize, seqLen, dmodel],
DMap
(BlockCache device dtype batchSize seqLen dmodel dhead nhead)
Identity
),
HScanlC
(TransformerLayer dmodel nhead ffnDim dtype device)
(Tensor device dtype [batchSize, seqLen, dmodel])
( HReplicateR ...) -- abbrev
( HReplicateR ...), -- abbrev
HTail
( HReplicateR...) -- abbrev
( HReplicateR...) -- abbrev
) => ...
Conceptually, this is because when numAttnLayers is 0, the call should be rejected. For example, you can't get the last element (hLast) or get the tail (hTail) of an empty list. However, you don't know what Nat your user is going to instantiate numAttnLayers with, so you need to ensure that they provide a correct one.
Following this line of thought, it appears that additionally mandating 1 <= numAttnLayers is enough. However, we can see that this does not suffice already: We needed a HScanlC constraint even though hScanlC should work with lists of any length.
The bigger problem is that HReplicateR is non-injective, which means you can't pattern match on it.
Relating this all back to the issue title of "Nonempty HList" now, the reason why I want a nonempty HList is that a nonempty HList only has one constructor (aside: this is assumption is the crux), and does not need to align 0 with HNil and do that tricky compiler inferencing business.
If our hlist is heterogeneous is nonempty, I think we don't need to write out a lot of these additional constraints. To elaborate,
transformerLM, as it currently stands, require constraints such as:Conceptually, this is because when
numAttnLayersis 0, the call should be rejected. For example, you can't get the last element (hLast) or get the tail (hTail) of an empty list. However, you don't know whatNatyour user is going to instantiatenumAttnLayerswith, so you need to ensure that they provide a correct one.Following this line of thought, it appears that additionally mandating
1 <= numAttnLayersis enough. However, we can see that this does not suffice already: We needed aHScanlCconstraint even thoughhScanlCshould work with lists of any length.The bigger problem is that
HReplicateRis non-injective, which means you can't pattern match on it.Relating this all back to the issue title of "Nonempty HList" now, the reason why I want a nonempty HList is that a nonempty HList only has one constructor (aside: this is assumption is the crux), and does not need to align 0 with HNil and do that tricky compiler inferencing business.