A Day in the Life of the Internet (DITL) is a large-scale data collection project undertaken by CAIDA and OARC every year since 2006. In addition to the recently completed 2011 collection, DNS-OARC is sponsoring an IPv6-Day collection. If you would like to participate by collecting and contributing DNS packet captures, please subscribe to the DITL mailing list.
Participation Requirements
There are no strict participation requirements. OARC is happy to accept data from members and non-members alike. You will need a login from OARC to submit data and OARC will need your ssh public key. Contact OARC Admin if you need to set up or update your login or ssh keys. If you are not an OARC member, you may want to sign a Proprietary Data Agreement with us, but this is not required. In terms of data sources, we are always interested in getting a lot of coverage from DNS Root servers, TLD servers, AS112 nodes, and "client-side" iterative/caching resolvers.
Types of DNS Data
Most of the data that we collect for DITL will be pcap files (e.g., from dnscap or tcpdump). We are also happy to accept other data formats such as BIND query logs, text files, SQL database dumps, and so on. We have an established system for receiving compressed pcap files from contributors. If you want to contribute data in a different format, please contact us to make transfer arrangements.
Pre-collection Checklist
- Please make sure that your collection hosts are time-synchronized with NTP. Do not simply use date to check a clock since you might be confused by time zone offsets. Instead use ntpdate like this:
$ ntpdate -q clock.isc.org
server 204.152.184.72, stratum 1, offset 0.002891, delay 0.02713
The reported offset should normally be very small (less than one second). If not, your clock is probably not synchronized with NTP.
- Be sure to do some "dry runs" before the actual collection time. This will obviously test your procedures and give you a sense of how much data you'll be collecting.
- Carefully consider your local storage options. Do you have enough local space to store all the DITL data? Or will you need to upload it as it is being collected? If you have enough space, perhaps you'll find it easier to collect first and upload after, rather than trying to manage both at the same time.
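The clock and storage checks above can be partly scripted. The following is a minimal sketch, not part of the DITL scripts: offset_ok and estimate_bytes are hypothetical helper names, the parsing assumes ntpdate prints lines in the format shown above, and the storage estimate is a simple linear extrapolation from a dry run that ignores daily traffic variation.

```shell
#!/bin/sh
# Hypothetical helpers for the pre-collection checklist (not part of ditl-tools).

# offset_ok LINE [MAX_SECONDS]
# Succeeds when the "offset" value in one line of "ntpdate -q" output
# has an absolute value below MAX_SECONDS (default 1).
offset_ok() {
    printf '%s\n' "$1" | awk -v max="${2:-1}" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "offset") { off = $(i + 1); found = 1 }
        }
        END {
            if (!found) exit 2          # no offset field on this line
            sub(/,$/, "", off)          # drop the trailing comma
            off += 0                    # force a numeric comparison
            if (off < 0) off = -off
            ok = (off < max + 0) ? 0 : 1
            exit ok
        }'
}

# estimate_bytes DRY_RUN_BYTES DRY_RUN_SECONDS COLLECTION_SECONDS
# Scales the size of a dry-run capture up to the full collection window.
# Assumes the shell does 64-bit integer arithmetic.
estimate_bytes() {
    echo $(( $1 * $3 / $2 ))
}

# Example usage (commented out because it needs ntpdate and network access):
# ntpdate -q clock.isc.org | grep '^server' | while read -r line; do
#     offset_ok "$line" || echo "check NTP on this host: $line"
# done
```

For example, if a one-hour dry run produced 1500000000 bytes, estimate_bytes 1500000000 3600 180000 extrapolates that to a 50-hour collection, a figure you can compare against the free space reported by df on the capture host.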
Collecting Data with dnscap
If you don't already have your own system for capturing DNS traffic, we recommend using dnscap with some shell scripts that we provide specifically for DITL collection.
- Download the most recent version of dnscap.
- Note that dnscap does not require libbind, unless you want to use the -x or -X options.
- Run ./configure, make and then 'make install' as root. This installs dnscap to /usr/local/bin.
- Copy settings.sh.default to settings.sh.
- Open settings.sh in a text editor.
- Set the IFACES variable to the names of your network interfaces carrying DNS data.
- Set the NODENAME variable (or leave it commented to use the output of `hostname` as the NODENAME). Please make sure that each instance of dnscap that you run has a unique NODENAME!
- Set the OARC_MEMBER variable to your OARC-assigned name. Note that the scripts automatically prepend "oarc-" to the login name so just give the short version here.
- Note that the scripts assume your OARC ssh upload key is at /root/.ssh/oarc_id_dsa.
- Look over the remaining variables in settings.sh. Read the comments in capture-dnscap.sh to understand what all the variables mean.
When you're done customizing the settings, the customized part of settings.sh should look something like this:
# Settings that you should customize
#
IFACES="fxp0"
NODENAME="lgh"
OARC_MEMBER="test"
#START_T='2011-06-07 11:00:00'
#STOP_T='2011-06-09 13:00:00'
Then run capture-dnscap.sh as root:
$ sudo sh capture-dnscap.sh
When it's time to do the actual DITL data collection, please uncomment the START_T and STOP_T variables in settings.sh and run the scripts from within a screen session.
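Before starting a long capture, it may help to sanity-check the settings described above. The sketch below is only an illustration: check_settings is a hypothetical helper, not one of the provided scripts, and it assumes that settings.sh is a plain shell fragment that can be sourced and that the variable names and key path are the ones mentioned in this section.

```shell
#!/bin/sh
# Hypothetical pre-flight check for settings.sh (not part of ditl-tools).

# check_settings [SETTINGS_FILE] [SSH_KEY_FILE]
# Sources the settings in a subshell and reports obvious problems.
# Returns 0 when everything looks plausible, non-zero otherwise.
check_settings() {
    settings_file=${1:-./settings.sh}
    key_file=${2:-/root/.ssh/oarc_id_dsa}
    (
        unset IFACES NODENAME OARC_MEMBER
        . "$settings_file" || exit 1
        status=0
        [ -n "$IFACES" ] || { echo "IFACES is not set"; status=1; }
        [ -n "$OARC_MEMBER" ] || { echo "OARC_MEMBER is not set"; status=1; }
        # The scripts prepend "oarc-" themselves, so only the short name belongs here.
        case $OARC_MEMBER in
            oarc-*) echo "OARC_MEMBER should not include the oarc- prefix"; status=1 ;;
        esac
        # A commented-out NODENAME falls back to the output of hostname.
        echo "node name: ${NODENAME:-$(hostname)}"
        [ -r "$key_file" ] || { echo "cannot read ssh key $key_file"; status=1; }
        exit "$status"
    )
}

# Example usage, run as root from the directory containing settings.sh:
# check_settings && sh capture-dnscap.sh
```

Running the check in a subshell keeps the sourced variables out of your interactive shell. It does not verify that each NODENAME is unique across your capture hosts, so that still needs to be checked by hand.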
Collecting Data with tcpdump and tcpdump-split
Another collection option is to use tcpdump and our tcpdump-split program. The instructions are similar to the above.
- Download and install the ditl-tools package (see link above).
- Copy settings.sh.default to settings.sh and open it in a text editor.
- Set the IFACES variable to the single network interface to collect DNS data from.
- Set NODENAME
- Set OARC_MEMBER
- Set DESTINATIONS if desired
Uncomment the START_T and STOP_T variables and use screen when it's time for the real collection.
$ sudo sh capture-tcpdump.sh