Here you will find scientific publications from the Fraunhofer Institutes.

Evaluation of hash functions for multipoint sampling in IP networks

: Henke, C.
: Zseby, T.; Schmoll, C.; Wolisz, A.

Fulltext urn:nbn:de:0011-n-1049064 (1.3 MByte PDF)
MD5 Fingerprint: 8b2e1b51fd931549156cf8def52d0af4
Created on: 24.9.2009

Berlin, 2008, 89 pp.
Berlin, TU, diploma thesis, 2008
Thesis, Electronic Publication
Fraunhofer FOKUS
hash-based sampling; passive multipoint measurements

Network measurements play an essential role in operating and developing today's Internet. A variety of measurement applications require multipoint network measurements: for example, service providers need to validate the delay guarantees in their Service Level Agreements, and network engineers want to track where packets are changed, reordered, lost or delayed. Multipoint measurements generate an immense amount of measurement data, which requires a resource-intensive measurement infrastructure. Data selection techniques such as sampling and filtering reduce resource consumption efficiently while still retaining sufficient information about the metrics of interest. However, not all selection techniques are suitable for multipoint measurements; only deterministic filtering allows a synchronized selection of packets at multiple observation points. A filter, though, bases its selection decision on the packet content and is therefore susceptible to bias, i.e. the selected subset is not representative of the whole population. Hash-based selection is a filtering method that tries to emulate random selection in order to obtain a representative sample for accurate estimation of traffic characteristics. This thesis assesses which hash function and which packet content should be used for hash-based selection to obtain a seemingly random and unbiased selection of packets. It empirically analyzes 25 hash functions and different packet content combinations for their suitability for hash-based selection. The experiments are based on a collection of 7 real traffic groups from different networks.
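
The idea of hash-based selection described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the use of CRC32 as the hash function, the names `select_packet` and `sample`, and the synthetic packet contents are all assumptions made for illustration. A packet is selected when the hash of its invariant content falls below a threshold, so every observation point that sees the same bytes reaches the same decision without any coordination.

```python
import zlib

HASH_RANGE = 2**32  # zlib.crc32 returns an unsigned 32-bit value


def select_packet(invariant_bytes: bytes, sampling_fraction: float) -> bool:
    """Select the packet if its hash falls into the selection range.

    invariant_bytes is assumed to hold only content that does not change
    along the path (hypothetical choice of fields, not the thesis' choice).
    """
    threshold = int(sampling_fraction * HASH_RANGE)
    return zlib.crc32(invariant_bytes) < threshold


def sample(packets, sampling_fraction):
    """Apply the same deterministic filter to a sequence of packets."""
    return [p for p in packets if select_packet(p, sampling_fraction)]


# Synthetic stand-ins for packet content; real input would be header
# fields and payload bytes taken from captured traffic.
packets = [f"flow-{i % 50}-seq-{i}".encode() for i in range(10_000)]

# Observation point A sees every packet.
point_a = sample(packets, 0.1)

# Observation point B misses every 7th packet (simulated loss).
arrived_at_b = [p for i, p in enumerate(packets) if i % 7 != 0]
point_b = sample(arrived_at_b, 0.1)

# Packets selected at A but absent at B were lost between the two points.
lost_in_sample = set(point_a) - set(point_b)
```

Because the decision depends only on the packet content, how close the selected fraction comes to the target and how representative the sample is both depend on the hash function and on the bytes fed into it, which is the question the thesis evaluates empirically.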