CSV/TSV parser that shows types per key, shows shared keys (given multiple files), and serializes data to JSON

arod1213/csvjson

CSVJSON

A command-line utility for serializing CSV/TSV files into readable JSON format.

Overview

CSVJSON converts comma-separated values (CSV) and tab-separated values (TSV) files into JSON. Options control the delimiter, how many lines are read and from which offset, and whether the output is pretty-printed JSON or JSONL.

Features

  • Convert CSV/TSV files to JSON
  • Type definitions for JSON keys
  • Show shared keys given multiple files
  • Customizable delimiters for various file formats
  • Line-based processing with offset support
  • Multiple output formats (pretty-printed JSON or JSONL)

Flags

-h

Prints help info

-r

# available read modes
all | key | type | field

Defaults to all.

-r all

# prints all key value pairs for all lines in csv
cat input.csv | csvjson -r all

# prints first set of key value pairs
cat input.csv | csvjson -r all -l 1

# skips the first 50 sets of key value pairs and prints only the next one
cat input.csv | csvjson -r all -l 1 -o 50

-r type

# prints all possible types for each key value
cat input.csv | csvjson -r type

-r key

# finds all shared keys for all csvs in current directory
csvjson -r key -f ./*.csv

# finds all keys which at least 5 files share in common
csvjson -r key -f ./*.csv -l 5

# finds all keys
csvjson -r key -f ./*.csv -l 1
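
The -l threshold in -r key mode can be pictured as counting, for each header, how many files contain it. The following is a minimal Python sketch of that idea, not csvjson's actual implementation. The function name `shared_keys`, the in-memory `headers_by_file` mapping, and the "all files" default are assumptions made for illustration.

```python
from collections import Counter


def shared_keys(headers_by_file, min_files=None):
    """Return headers that appear in at least `min_files` files.

    `headers_by_file` maps a file name to its list of header names.
    When `min_files` is None, a header must appear in every file
    (an assumed default, chosen to mirror "shared by all files").
    """
    if min_files is None:
        min_files = len(headers_by_file)
    counts = Counter()
    for headers in headers_by_file.values():
        # Count each header once per file, even if a file repeats it.
        counts.update(set(headers))
    return sorted(key for key, n in counts.items() if n >= min_files)


# Hypothetical headers, standing in for the first line of each CSV file.
files = {
    "a.csv": ["Title", "Artist", "Year"],
    "b.csv": ["Title", "Artist"],
    "c.csv": ["Title", "Label"],
}

print(shared_keys(files))               # keys present in all three files
print(shared_keys(files, min_files=2))  # keys present in at least two files
print(shared_keys(files, min_files=1))  # every key seen in any file
```

With a threshold of 1 every header qualifies, which matches the "finds all keys" example above.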

-r field

# prints all files that are missing the header "Title"
csvjson -r field -n "Title" -f ./*.csv

# prints all files that are missing the headers "Title" or "Artist"
csvjson -r field -n "Title" "Artist" -f ./*.csv
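
As a rough illustration of the -r field check, the Python sketch below reports every file that lacks at least one of the requested headers. It is only a sketch under assumptions: `files_missing_headers` and the sample data are hypothetical, and csvjson's own matching rules (for example case sensitivity) are not documented here.

```python
def files_missing_headers(headers_by_file, required):
    """Return names of files that lack at least one header in `required`."""
    missing = []
    for name, headers in headers_by_file.items():
        present = set(headers)
        # A file is reported as soon as any required header is absent.
        if any(field not in present for field in required):
            missing.append(name)
    return missing


# Hypothetical headers, standing in for the first line of each CSV file.
files = {
    "a.csv": ["Title", "Artist", "Year"],
    "b.csv": ["Title", "Year"],
    "c.csv": ["Artist", "Label"],
}

print(files_missing_headers(files, ["Title"]))
print(files_missing_headers(files, ["Title", "Artist"]))
```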

-s

Set the delimiter/separator for the input file.

Examples:

  • CSV: -s ','
  • TSV: -s $'\t'

cat input.csv | csvjson -s ','
cat input.tsv | csvjson -s $'\t'
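
To show why the delimiter matters, here is a small Python sketch that parses the same kind of data with two different separators using the standard `csv` module. It is not how csvjson is implemented; `rows_to_json` and the sample text are hypothetical. Unlike csvjson's output, this sketch leaves every value as a string.

```python
import csv
import io
import json


def rows_to_json(text, delimiter=","):
    """Parse delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


csv_text = "name,age\nJohn Doe,30\n"
# In the TSV sample the comma is part of the value, not a separator.
tsv_text = "name\tage\nJohn Doe,Jr.\t30\n"

print(json.dumps(rows_to_json(csv_text), indent=2))
print(json.dumps(rows_to_json(tsv_text, delimiter="\t"), indent=2))
```

Parsing the TSV sample with the default comma delimiter would split on the comma inside the name, which is the kind of mistake the -s flag avoids.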

-l

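Set the total number of lines to read from the input file. In -r key mode, -l instead sets the minimum number of files that must share a key (see -r key above).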

# read only first 100 lines
cat input.csv | csvjson -l=100  

-o

Set the offset for which line to start reading from.

# skip first 10 lines
cat input.csv | csvjson -o=10
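
The combination of -o and -l behaves like a sliding window over the rows: skip some, then take a fixed number. The Python sketch below illustrates that reading with `itertools.islice`. The helper name `window` is hypothetical, and whether csvjson counts the header row or uses zero-based offsets is an assumption not confirmed by this README.

```python
from itertools import islice


def window(rows, offset=0, limit=None):
    """Skip `offset` rows, then yield at most `limit` rows (all if None)."""
    stop = None if limit is None else offset + limit
    return islice(rows, offset, stop)


# Hypothetical data: 200 rows with ids 1..200.
rows = [{"id": i} for i in range(1, 201)]

selected = list(window(rows, offset=50, limit=100))
print(selected[0], selected[-1], len(selected))
```

Under these assumptions, an offset of 50 with a limit of 100 selects the rows with ids 51 through 150.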

-m

Output JSONL (JSON Lines) format instead of pretty-printed JSON. Each line contains a separate JSON object.

# print minimized jsonl
cat input.csv | csvjson -m

Usage Examples

Basic CSV conversion

cat input.csv | csvjson

Convert TSV with type definitions

cat input.tsv | csvjson -r type -s $'\t'

Process specific range with JSONL output

# skip the first 50 lines, then read the next 100, as JSONL
cat data.csv | csvjson -o 50 -l 100 -m

Output Formats

Pretty-printed JSON (default)

[
  {
    "name": "John Doe",
    "favorite_color": null,
    "languages": ["French", "English"],
    "age": 30,
    "email": "john@example.com"
  },
  {
    "name": "Jane Smith",
    "favorite_color": "blue",
    "languages": ["English"],
    "age": 25,
    "email": "jane@example.com"
  }
]

JSONL format (-m flag)

{"name":"John Doe","favorite_color":null,"languages":["French","English"],"age":30,"email":"john@example.com"}
{"name":"Jane Smith","favorite_color":"blue","languages":["English"],"age":25,"email":"jane@example.com"}
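
The difference between the two formats is only in serialization: pretty-printed output is one indented JSON array, while JSONL is one compact object per line. The Python sketch below reproduces both shapes with the standard `json` module. The function names and the shortened records are hypothetical, and this is not csvjson's own code.

```python
import json

# Shortened, hypothetical records modelled on the sample output above.
records = [
    {"name": "John Doe", "favorite_color": None, "age": 30},
    {"name": "Jane Smith", "favorite_color": "blue", "age": 25},
]


def to_pretty(records):
    """Serialize all records as a single indented JSON array."""
    return json.dumps(records, indent=2)


def to_jsonl(records):
    """Serialize each record as one compact JSON object per line."""
    # separators=(",", ":") drops the spaces json.dumps adds by default.
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records)


print(to_pretty(records))
print(to_jsonl(records))
```

JSONL is convenient for streaming because each line can be parsed on its own, without loading the whole array.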

Pretty-printed JSON with types (-r type flag)

{
    "name": "String",
    "favorite_color": "Null | String",
    "languages": "Array of String",
    "age": "Int",
    "email": "String"
}
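
One way to arrive at a summary like "Null | String" is to classify every cell and collect the distinct labels per key. The Python sketch below does this for three of the labels shown above ("Null", "Int", "String"). It is an assumption-laden illustration: csvjson's real inference rules, its handling of arrays, and any further type labels are not described in this README, and `cell_type` and `key_types` are hypothetical names.

```python
def cell_type(value):
    """Classify one raw CSV cell as "Null", "Int", or "String"."""
    # Assumption: an empty cell is treated as null.
    if value is None or value.strip() == "":
        return "Null"
    try:
        int(value)
        return "Int"
    except ValueError:
        return "String"


def key_types(rows):
    """Map each key to a " | "-joined summary of the types seen for it."""
    seen = {}
    for row in rows:
        for key, value in row.items():
            seen.setdefault(key, set()).add(cell_type(value))
    # Sorting keeps the summary stable regardless of row order.
    return {key: " | ".join(sorted(types)) for key, types in seen.items()}


# Hypothetical rows, as raw strings straight from a CSV parser.
rows = [
    {"name": "John Doe", "favorite_color": "", "age": "30"},
    {"name": "Jane Smith", "favorite_color": "blue", "age": "25"},
]

print(key_types(rows))
```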
