A Day in the Life of the Internet is a large-scale data collection project undertaken by CAIDA and OARC every year since 2006. This year, the DITL collection will take place in April. If you would like to participate by collecting and contributing DNS packet captures, please subscribe to the DITL mailing list.
Participation Requirements
There are no strict participation requirements. OARC is happy to accept data from members and non-members alike. If you are a non-member, you may want to sign a Proprietary Data Agreement with us, but this is not required. In terms of data sources, we are always interested in broad coverage from DNS Root servers, TLD servers, AS112 nodes, and "client-side" iterative/caching resolvers.
Types of DNS Data
Most of the data that we collect for DITL will be pcap files (e.g., from dnscap or tcpdump). We are also happy to accept other data formats such as BIND query logs, text files, SQL database dumps, and so on. We have an established system for receiving compressed pcap files from contributors. If you want to contribute data in a different format, please contact us to make transfer arrangements.
- Please make sure that your collection hosts are time-synchronized with NTP. Do not simply use date to check a clock, since you might be confused by time zone offsets. Instead, use ntpdate like this:
    $ ntpdate -q clock.isc.org
    server 18.104.22.168, stratum 1, offset 0.002891, delay 0.02713
  The reported offset should normally be very small (less than one second). If not, your clock is probably not synchronized with NTP.
- Be sure to do some "dry runs" before the actual collection time. These test your procedures and give you a sense of how much data you'll be collecting.
- Carefully consider your local storage options. Do you have enough local space to store all the DITL data? Or will you need to upload it as it is being collected? If you have enough space, perhaps you'll find it easier to collect first and upload after, rather than trying to manage both at the same time.
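Two items on the checklist above lend themselves to scripting during a dry run: verifying the NTP offset and estimating storage needs. The following is a minimal sketch, not part of ditl-tools. The helper names offset_ok and estimate_bytes are hypothetical. offset_ok assumes the "server ..., offset X, delay Y" line format shown in the ntpdate example, and estimate_bytes assumes traffic scales linearly with time, which real DNS traffic only approximates.

```shell
#!/bin/sh
# Hypothetical helpers for a DITL dry run; neither is part of ditl-tools.

# offset_ok LINE [MAX_SECONDS]
# Succeeds if LINE (one "server ..." line of "ntpdate -q" output) reports
# an absolute offset below MAX_SECONDS (default: 1 second).
offset_ok() {
    printf '%s\n' "$1" | awk -v max="${2:-1}" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "offset") {
                    off = $(i + 1)
                    sub(/,$/, "", off)       # strip the trailing comma
                    off += 0                 # force a numeric value
                    if (off < 0) off = -off  # compare the absolute offset
                    found = 1
                    exit                     # jump to END
                }
            }
        }
        END { exit !(found && off < max) }
    '
}

# estimate_bytes DRY_BYTES DRY_SECONDS PLANNED_SECONDS
# Linear extrapolation: bytes per second seen in the dry run, multiplied
# by the planned capture length. Fails if DRY_SECONDS is not positive.
estimate_bytes() {
    awk -v b="$1" -v d="$2" -v p="$3" 'BEGIN {
        if (d <= 0) exit 1
        printf "%.0f\n", b / d * p
    }'
}
```

For example, offset_ok "$(ntpdate -q clock.isc.org | grep '^server')" succeeds only if the reported offset is under one second, and estimate_bytes 150000000 600 3600 scales a 600-second dry run that produced 150000000 bytes up to a one-hour capture. Compare the estimate with the free space that df -k reports for the capture directory, and leave some headroom.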
Collecting Data with dnscap
If you don't already have your own system for capturing DNS traffic, we recommend using dnscap with some shell scripts that we provide specifically for DITL collection.
- Download the most recent ditl-tools tarball. This includes a copy of the dnscap sources.
- You might need to edit dnscap/Makefile. Then run 'make' from the top-level ditl-tools directory.
- Note that dnscap no longer depends on libbind.
- Run 'make install' as root. This installs dnscap to /usr/local/bin.
- Copy settings.sh.default to settings.sh.
- Open settings.sh in a text editor.
- Set the IFACES variable to the names of your network interfaces carrying DNS data.
- Set the NODENAME variable (or leave it commented out to use the output of `hostname` as the NODENAME). Please make sure that each instance of dnscap that you run has a unique NODENAME!
- Set the OARC_MEMBER variable to your OARC-assigned name. Note that the scripts automatically prepend "oarc-" to the login name so just give the short version here.
- Note that the scripts assume your OARC ssh upload key is at /root/.ssh/oarc_id_dsa.
- Look over the remaining variables in settings.sh. Read the comments in capture-dnscap.sh to understand what all the variables mean.
A customized settings.sh might begin like this:
    # Settings that you should customize
    #
    IFACES="fxp0"
    NODENAME="lgh"
    OARC_MEMBER="test"
When you're done customizing the settings, run capture-dnscap.sh as root:
    $ sudo sh capture-dnscap.sh
When it's time to do the actual DITL data collection, please uncomment the START_T and STOP_T variables in settings.sh and run the scripts from within a screen session.
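Before starting a long capture, it can be worth checking that the settings are actually filled in. The following is only a sketch and not part of ditl-tools: check_settings is a hypothetical helper, and it assumes that settings.sh is a plain shell fragment that assigns IFACES, NODENAME, and OARC_MEMBER, and that the upload key sits at the default path mentioned above unless a different path is given.

```shell
#!/bin/sh
# check_settings FILE [KEYFILE]
# Hypothetical pre-flight check. Sources FILE in a subshell (so the caller's
# environment is untouched) and reports values that look missing.
check_settings() (
    file=$1
    key=${2:-/root/.ssh/oarc_id_dsa}
    # "." searches PATH for names without a slash, so force a path.
    case $file in
        */*) ;;
        *) file=./$file ;;
    esac
    [ -r "$file" ] || { echo "cannot read $file" >&2; exit 1; }
    unset IFACES NODENAME OARC_MEMBER
    . "$file"
    rc=0
    [ -n "${IFACES:-}" ] || { echo "IFACES is not set" >&2; rc=1; }
    [ -n "${OARC_MEMBER:-}" ] || { echo "OARC_MEMBER is not set" >&2; rc=1; }
    # NODENAME may be left unset; the text says hostname is used instead.
    [ -n "${NODENAME:-}" ] || echo "NODENAME unset, hostname will be used" >&2
    [ -r "$key" ] || { echo "upload key $key is not readable" >&2; rc=1; }
    exit "$rc"
)
```

A possible use is check_settings settings.sh && sudo sh capture-dnscap.sh, so that the capture only starts when the basic settings are in place. When checking the default key path under /root, the check itself needs to run as root.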
Collecting Data with tcpdump and tcpdump-split
Another collection option is to use tcpdump and our tcpdump-split program. The instructions are similar to the above.
- Download and install the ditl-tools package (see link above).
- Copy settings.sh.default to settings.sh and open it in a text editor.
- Set the IFACES variable to the single network interface to collect DNS data from.
- Set NODENAME.
- Set OARC_MEMBER.
- Tweak the BPF_FILTER variable as necessary.
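As an illustration of what a tweaked filter might look like, the sketch below builds a pcap filter expression that limits capture to port 53 traffic for a given set of server addresses. The helper name build_bpf_filter and the addresses (taken from documentation ranges) are hypothetical, and the default BPF_FILTER shipped in settings.sh.default may differ, so check it before overriding.

```shell
#!/bin/sh
# build_bpf_filter [ADDR...]
# Hypothetical helper. Prints "port 53" when no addresses are given,
# otherwise "port 53 and (host A or host B ...)".
build_bpf_filter() {
    hosts=""
    for addr in "$@"; do
        if [ -z "$hosts" ]; then
            hosts="host $addr"
        else
            hosts="$hosts or host $addr"
        fi
    done
    if [ -n "$hosts" ]; then
        printf 'port 53 and (%s)\n' "$hosts"
    else
        printf 'port 53\n'
    fi
}

# Example for settings.sh (placeholder addresses):
# BPF_FILTER=$(build_bpf_filter 192.0.2.53 2001:db8::53)
```

One way to check an expression before the real collection is to pass it to tcpdump with the -d option, which compiles the filter and prints the result without capturing, so a syntax error shows up immediately.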
Uncomment the START_T and STOP_T variables and use screen when it's time for the actual collection.
    $ sudo sh capture-tcpdump.sh