Ternary Huffman Encoding To Store DNA

Scientists are looking for more efficient ways to store data, and one method of interest is using DNA as the storage medium. With the right encoding, one cubic centimeter of DNA can store 10¹⁶ bits of data, which means that all the world's data could be stored in one pound of DNA.

How does the program work

How can we achieve this? We take the contents of the file and encode them using Huffman coding, with trits (0, 1, 2) instead of bits. Each character's code is the path from the root of the ternary Huffman tree to that character's leaf, where the three branches of every node are labeled 0, 1 and 2. Every merge in a ternary tree replaces three nodes with one, so the number of leaves must be odd. If the file has an even number of unique characters, we add an imaginary character with a frequency of zero.

After converting the text to trits, we encode the trits using the DNA bases A (adenine), C (cytosine), G (guanine) and T (thymine), according to the following table. The row is the previously written base, the column is the current trit, and the cell is the base to write next. Decoding reverses these steps.

Previous base	Trit 0	Trit 1	Trit 2
A	C	G	T
C	G	T	A
G	T	A	C
T	A	C	G
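
The steps above can be sketched in Python as follows. This is only an illustration and not necessarily how the program in this repository is written: the function names are hypothetical, and the starting "previous base" of A for the very first trit is an assumption, since the table does not say which row to begin with.

```python
import heapq
from collections import Counter
from itertools import count

# Table from the text: NEXT_BASE[previous_base][trit] gives the base to write.
NEXT_BASE = {"A": "CGT", "C": "GTA", "G": "TAC", "T": "ACG"}


def ternary_huffman_codes(text):
    """Return {character: trit string} built from a ternary Huffman tree."""
    tie = count()  # unique tie-breaker so the heap never compares tree nodes
    # A leaf is a character, an internal node is a tuple of three children,
    # and None marks the imaginary zero-frequency character.
    heap = [(freq, next(tie), ch) for ch, freq in Counter(text).items()]
    if len(heap) % 2 == 0:
        heap.append((0, next(tie), None))  # pad to an odd number of leaves
    heapq.heapify(heap)
    while len(heap) > 1:
        three = [heapq.heappop(heap) for _ in range(3)]
        total = sum(freq for freq, _, _ in three)
        children = tuple(node for _, _, node in three)
        heapq.heappush(heap, (total, next(tie), children))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            for trit, child in enumerate(node):
                walk(child, prefix + str(trit))
        elif node is not None:
            codes[node] = prefix or "0"  # a single-character file still needs a trit

    walk(heap[0][2], "")
    return codes


def trits_to_dna(trits, start="A"):
    """Map a trit string to bases; `start` is an assumed initial previous base."""
    prev, bases = start, []
    for trit in trits:
        prev = NEXT_BASE[prev][int(trit)]
        bases.append(prev)
    return "".join(bases)


def dna_to_trits(dna, start="A"):
    """Invert trits_to_dna by looking each base up in the previous base's row."""
    prev, trits = start, []
    for base in dna:
        trits.append(str(NEXT_BASE[prev].index(base)))
        prev = base
    return "".join(trits)


def decode_trits(trits, codes):
    """Turn a trit string back into text using the prefix-free code map."""
    inverse = {code: ch for ch, code in codes.items()}
    chars, buffer = [], ""
    for trit in trits:
        buffer += trit
        if buffer in inverse:
            chars.append(inverse[buffer])
            buffer = ""
    return "".join(chars)
```

With the assumed starting base A, `trits_to_dna("012")` returns `CTG`, and `dna_to_trits("CTG")` gives back `012`. Because no row of the table contains its own base, the output never repeats the same base twice in a row.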

How to run

To run the program from the terminal, use the following command:

python dna_store.py [-d] input output huffman

Depending on your system, you may need to write python3 instead of python.

The program takes four parameters:

-d is an optional flag. If it is not given, the program encodes the input file into the output file using Huffman coding and saves the Huffman map to the given huffman CSV file. If it is given, the program does the opposite and decodes the input file using the given huffman CSV file.
input is either a plain text file to encode or, when -d is given, an encoded file to decode.
output is either the encoded output file or, when -d is given, the decoded output file.
huffman is a CSV file that maps each character to its ternary code.
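
As a rough illustration of this interface, the sketch below parses the same four parameters with argparse and reads and writes a Huffman map with the csv module. It is not taken from the program itself: the function names are hypothetical, and the two-column layout (character, ternary code) is an assumption about the CSV file.

```python
import argparse
import csv


def build_parser():
    """Parser mirroring: python dna_store.py [-d] input output huffman"""
    parser = argparse.ArgumentParser(prog="dna_store.py")
    parser.add_argument("-d", dest="decode", action="store_true",
                        help="decode the input file instead of encoding it")
    parser.add_argument("input", help="text file to encode, or encoded file with -d")
    parser.add_argument("output", help="encoded file, or decoded text file with -d")
    parser.add_argument("huffman", help="CSV file mapping characters to ternary codes")
    return parser


def save_codes(codes, path):
    """Write one 'character,code' row per entry (assumed layout)."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(codes.items())


def load_codes(path):
    """Read the map back; newline='' lets csv handle quoted line breaks."""
    with open(path, newline="", encoding="utf-8") as handle:
        return {char: code for char, code in csv.reader(handle)}
```

For example, `build_parser().parse_args(["-d", "coded.txt", "plain.txt", "huffman.csv"])` sets `decode` to `True`, while leaving out `-d` leaves it `False`. The csv module quotes characters such as commas and line breaks, so they survive a save and load round trip.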
