For the second and third parts you will need root-level access to a Linux computer (or administrator access on a Windows computer). If you do not have a suitable computer (e.g. if you only have a company laptop), please contact the course staff and a loan computer can be arranged. A virtual machine also works for this purpose.
If you are not very familiar with network capture tools (tcpdump, Wireshark or tshark), you can:
Begin by watching the tcpdump introductory video and the Wireshark introduction video for network capture.
View the ELEC-E7130 Network capture tutorial to go through the commands and code in detail.
Take a look at some code snippets that may help you.
At the end of this assignment, students should be able to
This assignment contains three tasks that introduce in more detail the traffic data that can be analysed with different tools. Please read all instructions before starting, because this helps you identify work that is common to several tasks.
To use some of the course-specific tools, some environment settings are needed on Aalto servers. Depending on your login shell, you need to run one of the following commands on a school computer. The first command is for any Bourne-compatible shell (such as the Aalto default zsh, or bash); the second is for C shell variants (csh, tcsh).
Note: You may type the command kinit before accessing the directory to avoid permission-related issues.
source /work/courses/unix/T/ELEC/E7130/general/use.sh
source /work/courses/unix/T/ELEC/E7130/general/use.csh
In your report file, state the name of each tool and the method (the command line, if any) you used to answer the above questions. We recommend that you use at least one command-line tool for the analysis because the data volume in the final assignment is much larger.
You must answer the following points appropriately:
In this task, capture the traffic data from your computer. If you use a virtual machine (VM), generate the traffic within the VM rather than on the host, because the VM acts as a separate computer.
Choose one of the available packet-capturing tools, such as dumpcap, Wireshark, or tcpdump, and capture network traffic for one hour or more while using the computer as you normally do (browse the web, check e-mail, watch videos, listen to music, do assignments, and so on).
Once you have the pcap file, use a tool (CoralReef, NetMate, tstat or a program of your choice) to convert the pcap file into flows.
Once you have the flow data, answer the following points.
Plot the traffic volume (bytes) of the flow data file.
Note: Getting the traffic volume from flow data files is more difficult because the only known information is the start time, end time, and flow size in bytes (as shown in the figure). For example, if a flow contains 100,000 bytes, starts at 3.4 and ends at 7.8, it lasts 4.4 seconds, so we can estimate roughly 22,700 bytes for each second (100,000 / 4.4), with proportionally less in the partially covered first and last seconds. See more information in the Network capture tutorial (Traffic volume in certain interval, p. 14).
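The note above can be sketched in a few lines of Python. This is only an illustration under the assumption that a flow sends at a constant rate between its start and end time; the function name spread_flow and the 1-second bin width are hypothetical choices, not something prescribed by the tutorial.

```python
import math
from collections import defaultdict


def spread_flow(start, end, size, bins, width=1.0):
    """Add a flow's bytes to per-interval bins, assuming a constant rate."""
    if end <= start:
        # Zero-length flow: put all bytes in the bin containing the start time.
        bins[math.floor(start / width)] += size
        return
    rate = size / (end - start)  # bytes per second over the flow's lifetime
    first = math.floor(start / width)
    last = math.floor(end / width)
    for i in range(first, last + 1):
        # Overlap between the flow and bin i, in seconds.
        lo = max(start, i * width)
        hi = min(end, (i + 1) * width)
        bins[i] += rate * (hi - lo)


bins = defaultdict(float)
spread_flow(3.4, 7.8, 100_000, bins)  # the example flow from the note
for i in sorted(bins):
    print(i, round(bins[i]))
```

For the example flow, seconds 4, 5 and 6 each receive about 22,727 bytes, while second 3 (covered for 0.6 s) receives about 13,636 bytes and second 7 (covered for 0.8 s) about 18,182 bytes. Calling spread_flow once per flow with the same bins dictionary accumulates the total volume per interval, which can then be plotted.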
List the five most commonly used protocols, the five most common source ports, and the five most common destination ports, based on flows. Present each result in its own table.
Hint: The column ‘pro’ defines the protocol used.
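As a rough sketch of how such a ranking could be computed, the snippet below counts values per flow with collections.Counter. It assumes a whitespace-separated flow file whose first line names the columns. Only the column name 'pro' comes from the hint above; 'sport', 'dport', the other column names, and the sample rows are hypothetical, so adjust them to the output of the tool you actually use.

```python
from collections import Counter
import io

# Hypothetical sample in an assumed whitespace-separated layout.
SAMPLE = """\
src dst pro sport dport bytes
10.0.0.1 10.0.0.2 6 50000 443 1200
10.0.0.1 10.0.0.3 17 50001 53 80
10.0.0.4 10.0.0.2 6 50002 443 900
10.0.0.1 10.0.0.2 6 50003 80 300
"""


def top_values(lines, column, n=5):
    """Return the n most common values of one column, counted per flow."""
    it = iter(lines)
    header = next(it).split()
    idx = header.index(column)
    counts = Counter()
    for line in it:
        fields = line.split()
        if len(fields) == len(header):  # skip blank or malformed lines
            counts[fields[idx]] += 1
    return counts.most_common(n)


for col in ("pro", "sport", "dport"):
    print(col, top_values(io.StringIO(SAMPLE), col))
```

With a real flow file, pass an open file object instead of the io.StringIO sample. The protocol values are IP protocol numbers (for example 6 for TCP and 17 for UDP), which can be translated to names in the report table.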
Which are the top-ten host pairs based on
Plot the number of flows for the 100 most common pairs of hosts
Repeat the previous plot (both linear and logarithmic scale), this time using the fixed-size array approach with 2^16 (65,536) slots (Network capture tutorial - Large data analysis, p. 8 and solution #2, p. 10). What can you say about the results?
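One possible reading of a fixed-size array approach is sketched below: each host pair is hashed to one of a fixed number of counters, so memory use does not grow with the number of distinct pairs, at the cost of collisions. This is an assumption-laden illustration, not the tutorial's solution #2. The slot count is assumed to be 2^16, and the names SLOTS, slot_of and count_flows as well as the use of CRC32 as the hash are hypothetical choices.

```python
import zlib
from array import array

SLOTS = 1 << 16  # assumed slot count: 65,536 counters, fixed regardless of input


def slot_of(src, dst):
    """Map a host pair to a slot index with a deterministic hash (CRC32)."""
    key = f"{src}>{dst}".encode()
    return zlib.crc32(key) % SLOTS


def count_flows(pairs):
    """Count flows per slot; host pairs that collide share one counter."""
    counts = array("L", [0]) * SLOTS
    for src, dst in pairs:
        counts[slot_of(src, dst)] += 1
    return counts


# Hypothetical input: one (source, destination) tuple per flow.
pairs = [("10.0.0.1", "10.0.0.2")] * 3 + [("10.0.0.1", "10.0.0.3")]
counts = count_flows(pairs)
top = sorted(counts, reverse=True)[:100]  # 100 largest slot counts, ready to plot
print(top[:3])
```

Because different host pairs can land in the same slot, the counts are upper bounds on the true per-pair counts. Comparing this plot with the exact one shows how much collisions distort the result for your data.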
Is there a more efficient approach in terms of running time and memory consumption to accomplish this task?
Note: You can use the /bin/time command to get the resource consumption of a command; use -v for more verbose output. It provides more detailed output than the shell built-in time.
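If the analysis is written in Python, a complementary in-process measurement can be sketched with the standard library. The helper name measure is hypothetical, and tracemalloc only sees memory allocated by Python itself, so the figures will not match what /bin/time reports for the whole process.

```python
import time
import tracemalloc


def measure(func, *args):
    """Run func(*args); return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


# Stand-in workload; replace the lambda with your own analysis function.
result, elapsed, peak = measure(lambda n: sum(range(n)), 100_000)
print(f"result={result} time={elapsed:.4f}s peak={peak} bytes")
```

Measuring two alternative implementations this way gives a quick relative comparison of running time and memory before confirming the numbers with /bin/time.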
Based on the traffic captured in Task 2, utilize an appropriate tool to analyze the captured data and provide answers to the following questions:
Note: Please choose one of the mass analysis tools listed in Table 1 (Mass analysis tools) or another suitable tool. Some packet-capturing software can also analyze such a small amount of data, but it is better to practice with a mass analysis tool.
To pass this course, you need to achieve at least 15 points in this assignment. If you submit the assignment late, you can get a maximum of 15 points.
You can get up to 30 points for this assignment:
Task 1
Task 2
Task 3
The quality of the report (bonus 2p)
Your submission must contain the following (please do not include the original data in your submission):
The report must have: