Web Crawler CLI

Web Crawler CLI is a simple command-line interface for the Frog Front Web Crawler Library.

Building

The project is built with Gradle. Once Gradle is installed, build the project with the following command.

$> gradle
...
BUILD SUCCESSFUL in 17s
6 actionable tasks: 6 executed

Running

The application can be run from the command line in two ways.

The first is to have Gradle run the project through its built-in JavaExec task. It is invoked as follows.

$> gradle run --args='-f output.txt https://sample.com/'

Building report for https://sample.com/
processing of https://sample.com/ took 9 seconds
output file located at -> output.txt

The second method is to invoke it with the java -jar command. The initial build runs Shadow Jar, which creates a fat jar in build/libs.

$> java -jar build/libs/web-crawler-cli-{version}.jar -f output.txt https://sample.com/

Building report for https://sample.com/
processing of https://sample.com/ took 9 seconds
output file located at -> output.txt

Running in Docker

If you don't want to build locally from source, you can use the runnable Docker image. You will need Docker installed first.

The following command runs the latest Docker image and writes the output to your current directory.

$> docker run -e "crawl_url=https://sample.com/" -v $(pwd):/app/out cuzz22000/web-crawler-cli

Building Docker

The following is the build command for the Docker image. It will install the latest .jar file located in your build/libs directory. You will have to substitute your Docker repository name.

$> docker build -t ${your_repo}/web-crawler-cli:latest .

Future Plans

  • Implement a more robust CLI. Currently the arguments have to be given in a fixed order, which is not nice!
  • Dockerize!! Running the application from a Docker container and having it available via Docker Hub would be pretty cool.
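
As an illustration of the first item, here is a minimal sketch of order-independent argument handling. It is not the project's actual code: the class name CrawlerArgs, its fields, and the default output file name are assumptions made for this example. Only the -f option and the single URL argument come from the commands shown above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical argument holder for illustration; not part of the actual project. */
final class CrawlerArgs {
    final String outputFile; // value given with -f, or the assumed default
    final String url;        // the single positional argument

    private CrawlerArgs(String outputFile, String url) {
        this.outputFile = outputFile;
        this.url = url;
    }

    /** Accepts "-f FILE" before or after the URL. */
    static CrawlerArgs parse(String[] args) {
        String outputFile = "output.txt"; // assumed default, not confirmed by the project
        List<String> positional = new ArrayList<>();

        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if ("-f".equals(arg)) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-f requires a file name");
                }
                outputFile = args[++i]; // consume the option's value
            } else if (arg.startsWith("-")) {
                throw new IllegalArgumentException("unknown option: " + arg);
            } else {
                positional.add(arg); // anything else is treated as the URL
            }
        }

        if (positional.size() != 1) {
            throw new IllegalArgumentException(
                "expected exactly one URL, got " + positional.size());
        }
        return new CrawlerArgs(outputFile, positional.get(0));
    }

    public static void main(String[] args) {
        CrawlerArgs parsed = parse(args);
        System.out.println("url=" + parsed.url + " output=" + parsed.outputFile);
    }
}
```

With this approach, both "-f output.txt https://sample.com/" and "https://sample.com/ -f output.txt" parse to the same result. A dedicated command-line parsing library would be the more robust choice for anything beyond a couple of options.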