The assignment was to import the EPA access logs from 1995, restructure the data, and provide a graphical analysis of it. Because the description is currently not available at http://ita.ee.lbl.gov/html/contrib/EPA-HTTP.html, the raw file epa-http.txt is enclosed and downloadable here. The trace contains a day’s worth of all HTTP requests to the EPA web server located at Research Triangle Park, NC. I’d like to acknowledge Dr. Laura Bottomley (laurab@ee.duke.edu) of Duke University for freely distributing the logs, which were used and analyzed in this project.
The logs were collected from 23:53:25 EDT on Tuesday, August 29, 1995, through 23:53:07 on Wednesday, August 30, 1995, a total of 24 hours. There were 47,748 total requests, 46,014 GET requests, 1,622 POST requests, 107 HEAD requests, and 6 invalid requests. Timestamps have one-second precision.
In the first step, I wrote a script that imports the access log file, cleans it of uncommon characters, and writes the log data to a new file structured as a JSON array. Second, I wrote the HTML and JavaScript files that read the JSON file and render the following analyses graphically as charts:
Aside from that, it was essential for me to pick the right chart to give a comprehensible explanation of the data. To keep the focus on the charts, I implemented a decent and responsive user interface.
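The import step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the script used in the project. It assumes each line of epa-http.txt has the layout `host [DD:HH:MM:SS] "request" status bytes`. The function names (`parse_line`, `convert`), the JSON field names, and the cleaning rule (dropping non-printable characters) are hypothetical choices made for the example.

```python
import json
import re

# Assumed line layout: host [DD:HH:MM:SS] "request" status bytes
# The request part is matched greedily because it may itself contain quotes.
LINE_RE = re.compile(
    r'^(?P<host>\S+) '
    r'\[(?P<day>\d+):(?P<hour>\d+):(?P<minute>\d+):(?P<second>\d+)\] '
    r'"(?P<request>.*)" (?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # Hypothetical cleaning rule: drop non-printable characters from the request.
    request = "".join(ch for ch in match["request"] if ch.isprintable())
    parts = request.split()
    return {
        "host": match["host"],
        "time": {k: int(match[k]) for k in ("day", "hour", "minute", "second")},
        "method": parts[0] if parts else None,
        "url": parts[1] if len(parts) > 1 else None,
        "status": int(match["status"]),
        # A "-" in the size column is stored as 0 bytes (an assumption).
        "bytes": 0 if match["size"] == "-" else int(match["size"]),
    }


def convert(src_path, dst_path):
    """Read the raw log, write a JSON array, and return the record count."""
    # latin-1 accepts any byte value, so unusual bytes do not abort the import.
    with open(src_path, encoding="latin-1") as src:
        records = [r for r in map(parse_line, src) if r is not None]
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(records, dst)
    return len(records)
```

Calling `convert("epa-http.txt", "epa-http.json")` would then produce a single JSON array that the JavaScript side can load in one request. The output file name is also only an example.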