Open
Labels
good first issue (Good for newcomers)
Description
This was written in one shot, so there is some repetition.
leftIndicesToGroup = M.elems $ M.filterWithKey (\k _ -> k `elem` cs) (D.columnIndices left)
leftRowRepresentations = VU.generate (fst (D.dimensions left)) (D.mkRowRep leftIndicesToGroup left)
-- key -> [index0, index1]
leftKeyCountsAndIndices = VU.foldr (\(i, v) acc -> M.insertWith (++) v [i] acc) M.empty (VU.indexed leftRowRepresentations)
-- key -> [index0, index1]
rightIndicesToGroup = M.elems $ M.filterWithKey (\k _ -> k `elem` cs) (D.columnIndices right)
rightRowRepresentations = VU.generate (fst (D.dimensions right)) (D.mkRowRep rightIndicesToGroup right)
rightKeyCountsAndIndices = VU.foldr (\(i, v) acc -> M.insertWith (++) v [i] acc) M.empty (VU.indexed rightRowRepresentations)
-- key -> [(left_indexes0, right_indexes1)]
mergedKeyCountsAndIndices = M.foldrWithKey (\k v m -> if k `M.member` rightKeyCountsAndIndices then M.insert k (VU.fromList v, VU.fromList (rightKeyCountsAndIndices M.! k)) m else m) M.empty leftKeyCountsAndIndices
-- [(ints, ints)]
leftAndRightIndicies = M.elems mergedKeyCountsAndIndices
-- [(ints, ints)] (expanded to n * m)
expandedIndices = map (\(l, r) -> (mconcat (replicate (VU.length r) l), mconcat (replicate (VU.length l) r))) leftAndRightIndicies
expandedLeftIndicies = mconcat (map fst expandedIndices)
expandedRightIndicies = mconcat (map snd expandedIndices)
-- df
expandedLeft = left { columns = VB.map (D.atIndicesStable expandedLeftIndicies) (D.columns left), dataframeDimensions = (VU.length expandedLeftIndicies, snd (D.dataframeDimensions left))}
-- df
expandedRight = right { columns = VB.map (D.atIndicesStable expandedRightIndicies) (D.columns right), dataframeDimensions = (VU.length expandedRightIndicies, snd (D.dataframeDimensions right))}
-- [string]
leftColumns = D.columnNames left
rightColumns = D.columnNames right
The comments are also not very informative.
This should be broken into functions and tested.
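As a starting point, here is a rough sketch of one way the index bookkeeping could be split into small pure functions that are testable without building a dataframe. All the names below (`groupIndicesByKey`, `matchKeys`, `expandGroup`, `innerJoinIndices`) are hypothetical and not part of the library. The sketch assumes each row has already been reduced to a key (the role `D.mkRowRep` plays above), and it uses plain lists and `Data.Map.Strict` in place of unboxed vectors to stay self-contained.

```haskell
import qualified Data.Map.Strict as M

-- | Group row indices by key: key -> [index0, index1, ...], ascending.
-- Covers both the left and right cases, which removes the duplication.
groupIndicesByKey :: Ord k => [k] -> M.Map k [Int]
groupIndicesByKey keys =
  foldr (\(i, k) acc -> M.insertWith (++) k [i] acc) M.empty (zip [0 ..] keys)

-- | Keep only keys present on both sides: key -> (leftIndices, rightIndices).
matchKeys :: Ord k => M.Map k [Int] -> M.Map k [Int] -> M.Map k ([Int], [Int])
matchKeys = M.intersectionWith (,)

-- | Expand one matched group into its n * m combinations,
-- returned as two parallel index lists.
expandGroup :: ([Int], [Int]) -> ([Int], [Int])
expandGroup (ls, rs) = unzip [(l, r) | l <- ls, r <- rs]

-- | Row indices to take from each side for an inner join,
-- given one key per row on each side.
innerJoinIndices :: Ord k => [k] -> [k] -> ([Int], [Int])
innerJoinIndices leftKeys rightKeys =
  let matched = matchKeys (groupIndicesByKey leftKeys) (groupIndicesByKey rightKeys)
      (ls, rs) = unzip (map expandGroup (M.elems matched))
   in (concat ls, concat rs)
```

For example, `innerJoinIndices [1, 2, 2, 3 :: Int] [2, 2, 4, 1]` evaluates to `([0,1,1,2,2],[3,0,1,0,1])`. That input is worth keeping as a test case: as quoted, `expandedIndices` appears to replicate each side whole and pair the results positionally, so a key with two rows on each side would produce the same two pairs twice instead of all four combinations. The list comprehension in `expandGroup` avoids this.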