Commit 9b3922f

Add MongoDB support for Geonames data import

- Implement MongoDB converters for postal codes and gazetteer data
- Add new CLI options for MongoDB connection and import
- Create geospatial indexes for efficient querying
- Update README with MongoDB usage examples and documentation
- Add MongoDB library as a suggested dependency in composer.json

1 parent b8a340a commit 9b3922f

6 files changed

Lines changed: 681 additions & 8 deletions

README.md

Lines changed: 131 additions & 4 deletions
@@ -11,6 +11,7 @@ A PHP library for downloading and converting Geonames data. This library provide
 - Download postal codes data for specific countries or all countries
 - Download detailed geographical data (Gazetteer) including administrative divisions
 - Convert data to JSON format with proper structure
+- Import data directly to MongoDB with proper indexing
 - Memory-efficient processing for large datasets
 - Progress bars for all operations
 - Support for filtering by feature types (for Gazetteer data)
@@ -39,7 +40,10 @@ Download postal codes for all countries:

 Options:
 - `--output (-o)`: Output directory (default: ./data)
-- `--format (-f)`: Output format (default: json)
+- `--format (-f)`: Output format (default: json, options: json, mongodb)
+- `--mongodb-uri`: MongoDB connection URI (default: mongodb://localhost:27017)
+- `--mongodb-db`: MongoDB database name (default: geonames)
+- `--mongodb-collection`: MongoDB collection name (default: postal_codes)

 The postal codes data includes:
 - Country code
@@ -65,8 +69,11 @@ Download geographical data for all countries:

 Options:
 - `--output (-o)`: Output directory (default: ./data)
-- `--format (-f)`: Output format (default: json)
+- `--format (-f)`: Output format (default: json, options: json, mongodb)
 - `--feature-class (-c)`: Filter by feature class (default: P)
+- `--mongodb-uri`: MongoDB connection URI (default: mongodb://localhost:27017)
+- `--mongodb-db`: MongoDB database name (default: geonames)
+- `--mongodb-collection`: MongoDB collection name (default: gazetteer)

 Available feature classes:
 - `A`: Country, state, region
@@ -93,7 +100,9 @@ The Gazetteer data includes:

 ## Data Structure

-### Postal Codes JSON Structure
+### Postal Codes Structure
+
+#### JSON Format

 ```json
 {
@@ -112,7 +121,40 @@ The Gazetteer data includes:
 }
 ```

-### Gazetteer JSON Structure
+#### MongoDB Format
+
+In MongoDB, the postal codes data has the same structure as JSON but includes an additional `location` field for geospatial queries:
+
+```json
+{
+    "country_code": "TH",
+    "postal_code": "10200",
+    "place_name": "Bang Rak",
+    "admin_name1": "Bangkok",
+    "admin_code1": "10",
+    "admin_name2": "",
+    "admin_code2": "",
+    "admin_name3": "",
+    "admin_code3": "",
+    "latitude": 13.7235,
+    "longitude": 100.5147,
+    "accuracy": 1,
+    "location": {
+        "type": "Point",
+        "coordinates": [100.5147, 13.7235]
+    }
+}
+```
+
+The MongoDB collection is indexed for efficient queries:
+- Compound index on `country_code` and `postal_code` (unique)
+- Index on `country_code`
+- Index on `postal_code`
+- Geospatial index on `location`
+
+### Gazetteer Structure
+
+#### JSON Format

 ```json
 {
@@ -140,6 +182,91 @@ The Gazetteer data includes:
 }
 ```

+#### MongoDB Format
+
+In MongoDB, the gazetteer data has the same structure as JSON but includes an additional `location` field for geospatial queries:
+
+```json
+{
+    "geoname_id": 1609350,
+    "name": "Bangkok",
+    "ascii_name": "Bangkok",
+    "alternate_names": ["Krung Thep", "กรุงเทพมหานคร"],
+    "latitude": 13.75,
+    "longitude": 100.51667,
+    "location": {
+        "type": "Point",
+        "coordinates": [100.51667, 13.75]
+    },
+    "feature_class": "P",
+    "feature_code": "PPLC",
+    "country_code": "TH",
+    "cc2": [],
+    "admin1_code": "40",
+    "admin1_name": "Bangkok",
+    "admin2_code": "",
+    "admin2_name": "",
+    "admin3_code": "",
+    "admin4_code": "",
+    "population": 5104476,
+    "elevation": 2,
+    "dem": 4,
+    "timezone": "Asia/Bangkok",
+    "modification_date": "2023-01-12"
+}
+```
+
+The MongoDB collection is indexed for efficient queries:
+- Unique index on `geoname_id`
+- Index on `country_code`
+- Index on `feature_class`
+- Index on `feature_code`
+- Text index on `name` and `ascii_name`
+- Geospatial index on `location`
+
+## MongoDB Usage Examples
+
+### Finding locations near a point
+
+```php
+$client = new MongoDB\Client('mongodb://localhost:27017');
+$collection = $client->geonames->gazetteer;
+
+// Find all places within 5km of Bangkok
+$result = $collection->find([
+    'location' => [
+        '$near' => [
+            '$geometry' => [
+                'type' => 'Point',
+                'coordinates' => [100.51667, 13.75] // [longitude, latitude]
+            ],
+            '$maxDistance' => 5000 // 5km in meters
+        ]
+    ]
+]);
+
+foreach ($result as $place) {
+    echo $place['name'] . ' - ' . $place['feature_code'] . PHP_EOL;
+}
+```
+
+### Finding postal codes by country
+
+```php
+$client = new MongoDB\Client('mongodb://localhost:27017');
+$collection = $client->geonames->postal_codes;
+
+// Find all postal codes in Bangkok, Thailand
+$result = $collection->find([
+    'country_code' => 'TH',
+    'admin_name1' => 'Bangkok'
+]);
+
+foreach ($result as $postalCode) {
+    echo $postalCode['postal_code'] . ' - ' . $postalCode['place_name'] . PHP_EOL;
+}
+```
+
 ## License

 This package is open-sourced software licensed under the MIT license.
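
Note on the `location` field documented in the README diff above: it is a GeoJSON `Point`, which stores coordinates as `[longitude, latitude]`, the reverse of the latitude-first order used by the flat `latitude`/`longitude` fields. The converter classes are not shown in this view, so the following is only a minimal sketch of how such a field could be derived from a converted record. `withLocation()` is a hypothetical helper, not code from this commit.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of this commit): adds a GeoJSON Point
 * to a record that already carries numeric latitude/longitude fields.
 */
function withLocation(array $record): array
{
    $lat = (float) $record['latitude'];
    $lon = (float) $record['longitude'];

    // Reject values outside the valid latitude/longitude ranges.
    if ($lat < -90.0 || $lat > 90.0 || $lon < -180.0 || $lon > 180.0) {
        throw new InvalidArgumentException('Coordinates out of range.');
    }

    // GeoJSON order is [longitude, latitude].
    $record['location'] = [
        'type' => 'Point',
        'coordinates' => [$lon, $lat],
    ];

    return $record;
}

// Values taken from the Bangkok example in the README diff.
$doc = withLocation(['name' => 'Bangkok', 'latitude' => 13.75, 'longitude' => 100.51667]);
echo json_encode($doc['location']), PHP_EOL;
```

Validating the ranges before building the point keeps obviously swapped or malformed coordinates out of the collection that carries the geospatial index.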

composer.json

Lines changed: 2 additions & 1 deletion
@@ -49,7 +49,8 @@
     },
     "suggest": {
         "guzzlehttp/guzzle": "Required to download the data from Geonames.",
-        "ext-zip": "Required to extract the downloaded data."
+        "ext-zip": "Required to extract the downloaded data.",
+        "mongodb/mongodb": "Required for MongoDB output format support."
     },
     "minimum-stability": "dev",
     "prefer-stable": true,
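
Note on the `suggest` entry above: a suggested package is not installed automatically, so `--format=mongodb` can be requested on a system where `mongodb/mongodb` is absent. A minimal sketch of a guard that could run before the converter is constructed follows. `mongoDbLibraryAvailable()` is a hypothetical helper and the error wording is an assumption, not something this commit contains.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical guard (not part of this commit): reports whether the
 * suggested mongodb/mongodb package can be autoloaded.
 */
function mongoDbLibraryAvailable(): bool
{
    // MongoDB\Client is the entry-point class used in the README examples.
    return class_exists('MongoDB\\Client');
}

if (!mongoDbLibraryAvailable()) {
    // Assumed wording; the real commands print through Symfony's OutputInterface.
    fwrite(STDERR, 'Run "composer require mongodb/mongodb" to use --format=mongodb.' . PHP_EOL);
}
```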

src/Console/Commands/DownloadGazetteerCommand.php

Lines changed: 41 additions & 2 deletions
@@ -5,6 +5,7 @@
 namespace Farzai\Geonames\Console\Commands;

 use Farzai\Geonames\Converter\GazetteerConverter;
+use Farzai\Geonames\Converter\MongoDBGazetteerConverter;
 use Farzai\Geonames\Downloader\GazetteerDownloader;
 use Symfony\Component\Console\Command\Command;
 use Symfony\Component\Console\Input\InputArgument;
@@ -35,8 +36,11 @@ protected function configure(): void
         $this
             ->addArgument('country', InputArgument::OPTIONAL, 'Country code (e.g., TH, US) or "all" for all countries')
             ->addOption('output', 'o', InputOption::VALUE_REQUIRED, 'Output directory', getcwd().'/data')
-            ->addOption('format', 'f', InputOption::VALUE_REQUIRED, 'Output format (json)', 'json')
-            ->addOption('feature-class', 'c', InputOption::VALUE_REQUIRED, 'Filter by feature class (A,H,L,P,R,S,T,U,V)', 'P');
+            ->addOption('format', 'f', InputOption::VALUE_REQUIRED, 'Output format (json, mongodb)', 'json')
+            ->addOption('feature-class', 'c', InputOption::VALUE_REQUIRED, 'Filter by feature class (A,H,L,P,R,S,T,U,V)', 'P')
+            ->addOption('mongodb-uri', null, InputOption::VALUE_REQUIRED, 'MongoDB connection URI', 'mongodb://localhost:27017')
+            ->addOption('mongodb-db', null, InputOption::VALUE_REQUIRED, 'MongoDB database name', 'geonames')
+            ->addOption('mongodb-collection', null, InputOption::VALUE_REQUIRED, 'MongoDB collection name', 'gazetteer');
     }

     protected function execute(InputInterface $input, OutputInterface $output): int
@@ -83,6 +87,41 @@ protected function execute(InputInterface $input, OutputInterface $output): int

             $output->writeln('<info>Data has been downloaded and converted successfully!</info>');
             $output->writeln(sprintf('<info>Output file: %s</info>', $jsonFile));
+        } elseif ($format === 'mongodb') {
+            $output->writeln('<info>Converting to MongoDB format...</info>');
+
+            // Create MongoDB converter
+            $mongodbUri = $input->getOption('mongodb-uri');
+            $mongodbDb = $input->getOption('mongodb-db');
+            $mongodbCollection = $input->getOption('mongodb-collection');
+
+            $mongoConverter = new MongoDBGazetteerConverter(
+                $mongodbUri,
+                $mongodbDb,
+                $mongodbCollection
+            );
+            $mongoConverter->setOutput($output);
+
+            // Convert and import to MongoDB
+            $jsonFile = str_replace('.zip', '.json', $zipFile); // Dummy file name, not used
+            $mongoConverter->convert($zipFile, $jsonFile, $outputDir);
+
+            // Remove ZIP file after conversion
+            unlink($zipFile);
+
+            // Remove admin code files
+            if (file_exists($outputDir.'/admin1CodesASCII.txt')) {
+                unlink($outputDir.'/admin1CodesASCII.txt');
+            }
+            if (file_exists($outputDir.'/admin2Codes.txt')) {
+                unlink($outputDir.'/admin2Codes.txt');
+            }
+
+            $output->writeln('<info>Data has been downloaded and imported to MongoDB successfully!</info>');
+            $output->writeln(sprintf('<info>MongoDB: %s.%s</info>', $mongodbDb, $mongodbCollection));
+        } else {
+            $output->writeln(sprintf('<error>Unsupported format: %s</error>', $format));
+            return Command::FAILURE;
         }

         return Command::SUCCESS;
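
Note on the cleanup step in the hunk above: the ZIP file and the two admin code files are removed with separate `unlink()` calls, and only the admin files are guarded by `file_exists()`. One way to fold this into a single step is sketched below. `removeFiles()` is a hypothetical helper, not something this commit contains, and the demo uses throwaway temp files rather than real Geonames downloads.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of this commit): deletes every path that
 * exists as a regular file and returns the paths actually removed.
 */
function removeFiles(array $paths): array
{
    $removed = [];

    foreach ($paths as $path) {
        // Skip missing paths instead of letting unlink() emit a warning.
        if (is_file($path) && unlink($path)) {
            $removed[] = $path;
        }
    }

    return $removed;
}

// Demo: one file that exists and one that does not.
$existing = tempnam(sys_get_temp_dir(), 'geo');
$missing = $existing . '.missing';
$removed = removeFiles([$existing, $missing]);

echo count($removed), ' file(s) removed', PHP_EOL;
```

In the command, a call such as `removeFiles([$zipFile, $outputDir.'/admin1CodesASCII.txt', $outputDir.'/admin2Codes.txt'])` would cover all three deletions shown in the diff.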

src/Console/Commands/DownloadPostalCodesCommand.php

Lines changed: 32 additions & 1 deletion
@@ -4,6 +4,7 @@

 namespace Farzai\Geonames\Console\Commands;

+use Farzai\Geonames\Converter\MongoDBPostalCodeConverter;
 use Farzai\Geonames\Converter\PostalCodeConverter;
 use Farzai\Geonames\Downloader\GeonamesDownloader;
 use Symfony\Component\Console\Command\Command;
@@ -35,7 +36,10 @@ protected function configure(): void
         $this
             ->addArgument('country', InputArgument::OPTIONAL, 'Country code (e.g., TH, US) or "all" for all countries')
             ->addOption('output', 'o', InputOption::VALUE_REQUIRED, 'Output directory', getcwd().'/data')
-            ->addOption('format', 'f', InputOption::VALUE_REQUIRED, 'Output format (json)', 'json');
+            ->addOption('format', 'f', InputOption::VALUE_REQUIRED, 'Output format (json, mongodb)', 'json')
+            ->addOption('mongodb-uri', null, InputOption::VALUE_REQUIRED, 'MongoDB connection URI', 'mongodb://localhost:27017')
+            ->addOption('mongodb-db', null, InputOption::VALUE_REQUIRED, 'MongoDB database name', 'geonames')
+            ->addOption('mongodb-collection', null, InputOption::VALUE_REQUIRED, 'MongoDB collection name', 'postal_codes');
     }

     protected function execute(InputInterface $input, OutputInterface $output): int
@@ -74,6 +78,33 @@ protected function execute(InputInterface $input, OutputInterface $output): int

             $output->writeln('<info>Data has been downloaded and converted successfully!</info>');
             $output->writeln(sprintf('<info>Output file: %s</info>', $jsonFile));
+        } elseif ($format === 'mongodb') {
+            $output->writeln('<info>Converting to MongoDB format...</info>');
+
+            // Create MongoDB converter
+            $mongodbUri = $input->getOption('mongodb-uri');
+            $mongodbDb = $input->getOption('mongodb-db');
+            $mongodbCollection = $input->getOption('mongodb-collection');
+
+            $mongoConverter = new MongoDBPostalCodeConverter(
+                $mongodbUri,
+                $mongodbDb,
+                $mongodbCollection
+            );
+            $mongoConverter->setOutput($output);
+
+            // Convert and import to MongoDB
+            $jsonFile = str_replace('.zip', '.json', $zipFile); // Dummy file name, not used
+            $mongoConverter->convert($zipFile, $jsonFile);
+
+            // Remove ZIP file after conversion
+            unlink($zipFile);
+
+            $output->writeln('<info>Data has been downloaded and imported to MongoDB successfully!</info>');
+            $output->writeln(sprintf('<info>MongoDB: %s.%s</info>', $mongodbDb, $mongodbCollection));
+        } else {
+            $output->writeln(sprintf('<error>Unsupported format: %s</error>', $format));
+            return Command::FAILURE;
         }

         return Command::SUCCESS;
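
Note on the `Unsupported format` branch in both commands: it sits at the end of the `if`/`elseif` chain, and the hunks do not show where the download step runs relative to it. If the download happens first, a mistyped `--format` value is reported only after the archive has been fetched. A minimal sketch of validating the option up front is given below. `assertSupportedFormat()` and `SUPPORTED_FORMATS` are hypothetical names, not code from this commit.

```php
<?php

declare(strict_types=1);

// Formats named in the updated --format option description.
const SUPPORTED_FORMATS = ['json', 'mongodb'];

/**
 * Hypothetical helper (not part of this commit): returns the format
 * unchanged when it is supported and throws otherwise.
 */
function assertSupportedFormat(string $format): string
{
    if (!in_array($format, SUPPORTED_FORMATS, true)) {
        throw new InvalidArgumentException(sprintf(
            'Unsupported format: %s (expected one of: %s)',
            $format,
            implode(', ', SUPPORTED_FORMATS)
        ));
    }

    return $format;
}

// Demo: one accepted and one rejected value.
foreach (['mongodb', 'xml'] as $candidate) {
    try {
        echo assertSupportedFormat($candidate), PHP_EOL;
    } catch (InvalidArgumentException $e) {
        echo $e->getMessage(), PHP_EOL;
    }
}
```

Called at the top of `execute()`, a check like this would let both commands fail before any network or file work starts.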
