Hello. Thank you for kindly sharing such a good project.
Could you please explain why we need
encode_memory
@staticmethod
def encode_memory(mems):
    """Encode memory to ids

    <pad>: 0
    <SEP>: 1
    <unk>: 2
    mem_id: mem_offset + 3
    """
    ret = []
    for mem in mems[: VocabEntry.MAX_MEM_LENGTH]:
        if mem == "<SEP>":
            ret.append(1)
        elif mem > VocabEntry.MAX_STACK_SIZE:
            ret.append(2)
        else:
            ret.append(3 + mem)
    return ret

this function, and why does it compare integers with "<SEP>"?
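To make the question concrete, here is how I understand the function's behavior, as a standalone sketch. The values of MAX_MEM_LENGTH and MAX_STACK_SIZE below are placeholders I made up for illustration, not the project's real values, and the sample input is hypothetical.

```python
class VocabEntry:
    # Placeholder limits, assumed only for this sketch.
    MAX_MEM_LENGTH = 8
    MAX_STACK_SIZE = 16

    @staticmethod
    def encode_memory(mems):
        """Map each memory entry to an id: <SEP> -> 1, too large -> 2, else offset + 3."""
        ret = []
        for mem in mems[: VocabEntry.MAX_MEM_LENGTH]:
            if mem == "<SEP>":
                ret.append(1)          # separator marker
            elif mem > VocabEntry.MAX_STACK_SIZE:
                ret.append(2)          # out of range, treated as <unk>
            else:
                ret.append(3 + mem)    # ids 0..2 are reserved, so shift by 3
        return ret


# Hypothetical input: integer offsets mixed with the "<SEP>" string marker.
encoded = VocabEntry.encode_memory([0, 4, "<SEP>", 100, 16])
print(encoded)  # [3, 7, 1, 2, 19]
```

If this reading is right, the input list mixes integer offsets with the string marker "<SEP>", which would be why an element is compared against a string before it is compared against an integer.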
I can't make sense of appending an int token to an array of int tokens instead of an int token.
My second question is about this one:
def var_loc_in_func(loc):
    print(" TODO: fix the magic number for computing vocabulary idx")
    if isinstance(loc, Register):
        return 1030 + self.vocab.regs[loc.name]
    else:
        from utils.vocab import VocabEntry
        return (
            3 + stack_start_pos - loc.offset
            if stack_start_pos - loc.offset < VocabEntry.MAX_STACK_SIZE
            else 2
        )

What does the constant 1030 do, and why is it needed?
And in general, why do we define the tokens like this:
self.word2id["<pad>"] = PAD_ID
self.word2id["<s>"] = 1
self.word2id["</s>"] = 2
self.word2id["<unk>"] = 3
self.word2id[SAME_VARIABLE_TOKEN] = 4
but use them like this:
<pad>: 0
<SEP>: 1
<unk>: 2
mem_id: mem_offset + 3
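Here is a small sketch of the mismatch I mean, putting the two mappings side by side. PAD_ID = 0 and the string used for SAME_VARIABLE_TOKEN are my assumptions, since I am not sure of their actual values in the project.

```python
PAD_ID = 0                          # assumption: the real value may differ
SAME_VARIABLE_TOKEN = "<same-var>"  # hypothetical placeholder string

# Special tokens as defined in the vocabulary.
word2id = {
    "<pad>": PAD_ID,
    "<s>": 1,
    "</s>": 2,
    "<unk>": 3,
    SAME_VARIABLE_TOKEN: 4,
}

# Special ids as described in the encode_memory docstring.
mem2id = {"<pad>": 0, "<SEP>": 1, "<unk>": 2}

id2word = {i: tok for tok, i in word2id.items()}
id2mem = {i: tok for tok, i in mem2id.items()}

# Ids that exist in both schemes but name different tokens.
conflicts = {
    i: (id2word[i], id2mem[i])
    for i in sorted(id2word.keys() & id2mem.keys())
    if id2word[i] != id2mem[i]
}
print(conflicts)  # {1: ('<s>', '<SEP>'), 2: ('</s>', '<unk>')}
```

Under these assumptions, ids 1 and 2 mean different things in the two schemes, so I guess they must be two separate vocabularies, but I would like to confirm that.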
Sorry if my questions are too much; I specialize in systems programming, and math and ML are a hobby.
Thanks in advance =)