Web Crawler CLI is a simple command-line interface built on the Frog Front Web Crawler Library.
The project is built with Gradle. Once Gradle is installed, build the project with the following command.
$> gradle
...
BUILD SUCCESSFUL in 17s
6 actionable tasks: 6 executed
From the command line, the application can be run in two different ways.
The first is to let Gradle run the project through its internal JavaExec task. This is invoked as follows.
$> gradle run --args='-f output.txt https://sample.com/'
Building report for https://sample.com/
processing of https://sample.com/ took 9 seconds
output file located at -> output.txt
The second method is to invoke the application with the java -jar command. During the initial build, the Shadow Jar plugin is invoked to create a fat jar.
$> java -jar build/libs/web-crawler-cli-{version}.jar -f output.txt https://sample.com/
Building report for https://sample.com/
processing of https://sample.com/ took 9 seconds
output file located at -> output.txt
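The java -jar invocation above can also be assembled from another Java program. The following is only a minimal sketch under stated assumptions: the class name CrawlerLauncher and the method buildCommand are hypothetical and not part of this project, and the version string has to be supplied by the caller because the jar name contains the {version} placeholder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Web Crawler CLI: assembles the same
// `java -jar` command line shown above and runs it as a child process.
public class CrawlerLauncher {

    // Builds the argument list in the order used in the example above:
    // -f <output file> first, then the URL to crawl.
    static List<String> buildCommand(String version, String outputFile, String url) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add("build/libs/web-crawler-cli-" + version + ".jar");
        cmd.add("-f");
        cmd.add(outputFile);
        cmd.add(url);
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: CrawlerLauncher <version>");
            System.exit(2);
        }
        List<String> cmd = buildCommand(args[0], "output.txt", "https://sample.com/");
        // inheritIO() forwards the crawler's console output to this process.
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        System.exit(process.waitFor());
    }
}
```

The launcher has to be started from the project root, since the jar path is relative to it.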
If you don't want to build locally from source, you can use the runnable Docker image. You will need Docker installed first.
The following command will execute the latest Docker image and write the output to your current directory.
$> docker run -e "crawl_url=https://sample.com/" -v $(pwd):/app/out cuzz22000/web-crawler-cli
The following is the build command for the Docker image. It installs the latest .jar file located in your build/libs directory. You will have to substitute your Docker repository name.
$> docker build -t ${your_repo}/web-crawler-cli:latest .
- Implement a more robust CLI. Currently the arguments have to be given in a fixed order, which is not nice!
- Dockerize! Running the application from a Docker container and having it available via Docker Hub would be pretty cool.
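One way the more robust CLI could be approached is to scan the arguments once and accept the -f option in any position. The following is a minimal sketch, not the project's actual parser: the class name CliArgs and its fields are hypothetical, and it assumes the only inputs are the -f option and a single URL, as in the examples above.

```java
// Hypothetical sketch of order-independent argument parsing.
// Assumes exactly one URL and an optional "-f <file>" pair.
public class CliArgs {
    final String url;         // page to crawl (required)
    final String outputFile;  // value given to -f, or null when absent

    private CliArgs(String url, String outputFile) {
        this.url = url;
        this.outputFile = outputFile;
    }

    static CliArgs parse(String[] args) {
        String url = null;
        String outputFile = null;
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.equals("-f")) {
                // The option consumes the next argument as its value.
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-f requires a file name");
                }
                outputFile = args[++i];
            } else if (arg.startsWith("-")) {
                throw new IllegalArgumentException("unknown option: " + arg);
            } else if (url == null) {
                // The first non-option argument is treated as the URL.
                url = arg;
            } else {
                throw new IllegalArgumentException("unexpected argument: " + arg);
            }
        }
        if (url == null) {
            throw new IllegalArgumentException("missing URL");
        }
        return new CliArgs(url, outputFile);
    }

    public static void main(String[] args) {
        // Both orderings yield the same result.
        CliArgs a = parse(new String[] {"-f", "output.txt", "https://sample.com/"});
        CliArgs b = parse(new String[] {"https://sample.com/", "-f", "output.txt"});
        System.out.println(a.url + " -> " + a.outputFile);
        System.out.println(b.url + " -> " + b.outputFile);
    }
}
```

A hand-written loop keeps the sketch free of dependencies. An argument-parsing library would be a reasonable alternative if more options are added later.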