NOTE: All data is stored in Redis database 0.
To access, use the following commands:
>redis-cli
>keys *
This prints all the keys in the database.
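The same check can be done from Python. This is a minimal sketch, assuming the redis-py client package is installed and a Redis server is running on localhost at the default port 6379. Neither detail is stated in this readme, and the helper names connect and list_keys are hypothetical.

```python
def connect(db=0):
    """Open a connection to a local Redis server (host and port are assumed)."""
    import redis  # redis-py; imported lazily so list_keys works without it
    return redis.Redis(host="localhost", port=6379, db=db, decode_responses=True)


def list_keys(client, pattern="*"):
    """Return all keys matching pattern, sorted for stable output.

    Uses SCAN (via scan_iter) instead of KEYS, so a large database
    is walked incrementally rather than in one blocking call.
    """
    return sorted(client.scan_iter(match=pattern))


# Usage (requires a running server):
#     for key in list_keys(connect(db=0)):
#         print(key)
```

On a small database, KEYS from redis-cli is fine. The incremental scan only matters once many documents have been loaded.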
Parsing USPTO data:
Use parse.py to parse XML/text documents.
syntax: python parse.py inputFileOrDirectory
Parses the given file or directory and stores the results in Redis.
working example: python parse.py sample_data/small_sample.xml
Querying USPTO data:
Use query.py to query the Redis USPTO database.
syntax: python query.py --Title="query values" --Description="query values" --IssueDate="yyyymmdd-yyyymmdd" --ApprovalDate="yyyymmdd-yyyymmdd"
Title: space-separated strings, combined with AND semantics: every value must appear in the title.
Description: same as above, applied to the description.
IssueDate: date range for the issue date, given as min-max.
ApprovalDate: same as above, applied to the approval date.
All parameters are optional.
working example: python query.py --Title="vehicle said" --ApprovalDate=19800101-19900101
*Note: The example only works if you've loaded data from the given date range.*
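The AND semantics and the date ranges can be illustrated with a small in-memory sketch. It does not reproduce query.py, whose implementation is not shown here. The record layout, the function names, case-insensitive whole-word matching, and inclusive range bounds are all assumptions made for illustration.

```python
import re
from datetime import date, datetime


def parse_date_range(text):
    """Parse 'yyyymmdd-yyyymmdd' into a (min_date, max_date) tuple."""
    start_text, end_text = text.split("-")
    start = datetime.strptime(start_text, "%Y%m%d").date()
    end = datetime.strptime(end_text, "%Y%m%d").date()
    if start > end:
        raise ValueError("range must be given as min-max")
    return start, end


def matches_all(query, text):
    """True if every space-separated query value is a word in text (AND)."""
    words = set(re.findall(r"\w+", text.lower()))
    return all(value.lower() in words for value in query.split())


def filter_records(records, title=None, approval_range=None):
    """Keep records that satisfy every parameter that was supplied."""
    results = []
    for record in records:
        if title is not None and not matches_all(title, record["title"]):
            continue
        if approval_range is not None:
            low, high = parse_date_range(approval_range)
            if not low <= record["approval_date"] <= high:
                continue
        results.append(record)
    return results


# Made-up records, for illustration only.
records = [
    {"title": "Said vehicle braking system", "approval_date": date(1985, 6, 4)},
    {"title": "Vehicle door latch", "approval_date": date(1985, 6, 4)},
    {"title": "Said vehicle frame", "approval_date": date(1995, 1, 3)},
]
hits = filter_records(records, title="vehicle said", approval_range="19800101-19900101")
print([r["title"] for r in hits])  # ['Said vehicle braking system']
```

Only the first record is returned: the second lacks the word "said", and the third falls outside the approval date range.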
Bulk downloading USPTO data:
Use bulk_download.py to download data in bulk from the Google USPTO sources.
syntax: python bulk_download.py yyyymmdd-yyyymmdd
The date range is given as min-max.
*Note: Google has protection from bulk downloads. I've heard that they will actively ban IP addresses that hit their servers too hard.*
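One way to reduce the risk described in the note is to pause between requests. The sketch below is not taken from bulk_download.py. The throttled helper, its delay value, and the placeholder item names are hypothetical, and no particular delay is known to be safe.

```python
import time


def throttled(items, delay_seconds=5.0, sleep=time.sleep):
    """Yield items one at a time, pausing between consecutive items.

    The sleep function is injectable so the pacing can be tested
    without actually waiting.
    """
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay_seconds)  # wait before every item except the first
        yield item


# Placeholder names; a real script would fetch each file here.
for name in throttled(["file_a", "file_b", "file_c"], delay_seconds=0.01):
    print("would download", name)
```

Downloading a narrow date range per run is another way to keep the request rate low.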
To load the downloaded files into Redis, run parse.py on the directory 'data'.
To wipe the database, use query.py with the WipeMe parameter:
Syntax: python query.py --WipeMe="true"