A Virus Scanning Engine Using a Parallel Finite-Input Memory Machine and MPUs

Author: Hiroki Nakahara, Tsutomu Sasao, Munehiro Matsuura, and Yoshifumi Kawamura
Publisher: The International Conference on Field Programmable Logic and Applications (FPL) 2009
Presenter: Han-Chen Chen
Date: 2010/03/17


2

Introduction

This paper presents a virus scanning engine. The new architecture consists of a parallel finite-input memory machine (PFIMM) and general-purpose MPUs. It uses two-stage matching: in the first stage, the parallel hardware filter quickly scans the text to find partial matches; in the second stage, the MPU scans the text to check for a total match using a Bloom filter.
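The two-stage flow can be sketched in software as follows. This is a minimal illustration, not the engine itself: it assumes the partial match is taken on the first c bytes of each pattern, and it replaces the second-stage check with a plain byte comparison. The function names `first_stage` and `second_stage` are hypothetical.

```python
def first_stage(text: bytes, patterns: list[bytes], c: int) -> list[tuple[int, int]]:
    """Cheap filter: report (position, pattern index) wherever the
    first c bytes of a pattern occur in the text (a partial match)."""
    prefixes: dict[bytes, list[int]] = {}
    for idx, pattern in enumerate(patterns):
        prefixes.setdefault(pattern[:c], []).append(idx)
    hits = []
    for pos in range(len(text) - c + 1):
        for idx in prefixes.get(text[pos:pos + c], []):
            hits.append((pos, idx))
    return hits


def second_stage(text: bytes, patterns: list[bytes],
                 hits: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Expensive check: keep only the partial matches that are total matches.
    Assumption: a direct comparison stands in for the MPU-side check."""
    return [(pos, idx) for pos, idx in hits
            if text.startswith(patterns[idx], pos)]


patterns = [b"abcd", b"abxy"]
text = b"zzabcdzzabzz"
partial = first_stage(text, patterns, c=2)
total = second_stage(text, patterns, partial)
print(len(partial), "partial matches,", len(total), "total match")
```

With c = 2 the filter reports four partial matches and only one survives the second stage, which is why a longer partial match length c lowers the load on the second stage.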

3

Two-stage Matching

Partial match length c = 2

4

Analysis for Virus Scanning

Profiling on a PC shows that the first stage takes 83% of the CPU time and the second stage takes 17%. This suggests two improvements:

1. Implement a high-speed Aho-Corasick (AC) automaton in hardware to speed up the first stage.

2. Reduce the number of partial matches in the first stage (increase c) to reduce the load on the second stage.

5

FIMM (1/2)

Since the state transitions of the AC automaton are complex, the memory size tends to be large.

For a virus scanning system, the circuit would be too large even if the bit-partitioning method is used.

We use a finite-input memory machine (FIMM), which eliminates the circuits for the transition functions.

6

FIMM (2/2)

$M_{FIMM} = 2^{8c} \times \lceil \log_2 (k + 1) \rceil$, where k is the number of patterns.

7

Bit Partitioned FIMM

We partition the FIMM into r units. Since each memory scans only 8c/r bits out of the 8c bits, non-stored patterns may be matched.

The output of each FIMM is encoded as a 1-hot code to form a match vector (MV).

$M_{rFIMM} = 2^{8c/r} \times k \times r$.

By this expression, the memory size takes its minimum when r = 8: $M_{8FIMM} = 2^{c} \times k \times 8 = 2^{c+3} \times k$.
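To get a feel for the sizes involved, the memory formulas can be evaluated directly. The sketch below assumes the unpartitioned FIMM size reads as 2^(8c) × ⌈log2(k + 1)⌉ bits (a reconstruction of the slide's formula) and takes the r-unit size 2^(8c/r) × k × r from the slide; the function names are hypothetical.

```python
def fimm_bits(c: int, k: int) -> int:
    """Memory bits of an unpartitioned FIMM (assumed formula):
    2^(8c) words of ceil(log2(k + 1)) bits each."""
    # For k >= 1, k.bit_length() equals ceil(log2(k + 1)) exactly.
    return 2 ** (8 * c) * k.bit_length()


def r_fimm_bits(c: int, k: int, r: int) -> int:
    """Memory bits of a FIMM partitioned into r units with 1-hot
    match vectors: 2^(8c/r) * k * r."""
    if (8 * c) % r != 0:
        raise ValueError("r must divide 8c")
    return 2 ** (8 * c // r) * k * r


c, k = 2, 512
for r in (1, 2, 4, 8):
    print(f"r = {r}: {r_fimm_bits(c, k, r):,} bits")
print(f"unpartitioned FIMM: {fimm_bits(c, k):,} bits")
```

For r = 8 the result equals 2^(c+3) × k, matching the closed form on the slide.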

8

Compression of the Match Vectors

Pattern 1: {68 4D}

Pattern 2: {40 6D}

False positives: {48 6D} and {60 4D} also match.

$M_{MV} = 2^{c} \times k \times 8$

$M_{CMV} = 2^{c} \times (k/m) \times 8$, where m is the compression ratio.
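The false positives in this example can be reproduced with a small software model of an 8_FIMM. The sketch assumes that unit b is addressed by bit b of each of the c characters, that the eight match vectors are combined with a bitwise AND, and that compression lets m consecutive patterns share one match-vector bit. These are modelling assumptions, and the helper names are hypothetical.

```python
def bit_slice(window: bytes, bit: int) -> int:
    """Pack bit number `bit` of every byte in the window into one address."""
    addr = 0
    for byte in window:
        addr = (addr << 1) | ((byte >> bit) & 1)
    return addr


def build_8fimm(patterns: list[bytes], m: int = 1) -> list[list[int]]:
    """Build the 8 memories. Each word is a match vector stored as an int;
    with m > 1, m consecutive patterns share one bit (compressed MV)."""
    c = len(patterns[0])
    mems = [[0] * (1 << c) for _ in range(8)]
    for idx, pattern in enumerate(patterns):
        for bit in range(8):
            mems[bit][bit_slice(pattern, bit)] |= 1 << (idx // m)
    return mems


def match_vector(mems: list[list[int]], window: bytes) -> int:
    """AND the match vectors of all 8 units; non-zero means a partial match."""
    mv = -1  # all ones
    for bit in range(8):
        mv &= mems[bit][bit_slice(window, bit)]
    return mv


patterns = [bytes.fromhex("684D"), bytes.fromhex("406D")]
plain = build_8fimm(patterns, m=1)       # one MV bit per pattern
compressed = build_8fimm(patterns, m=2)  # both patterns share one MV bit

for hexstr in ("684D", "406D", "486D", "604D"):
    w = bytes.fromhex(hexstr)
    print(hexstr, bin(match_vector(plain, w)), bin(match_vector(compressed, w)))
```

In this model {48 6D} and {60 4D} are rejected with uncompressed match vectors but accepted once the two patterns share a bit, which is the price paid for the m-fold memory reduction.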

9

Virus Scanning Engine

• PFIMM1024 uses 128 units of 8_FIMMs.

• The FIFO stores the pattern numbers of partial matches and the positions of the detected sub-patterns.

• When a sub-pattern is detected, the FIFO sends an interrupt signal (IRQ) to the MPU. When the MPU accepts an IRQ, it scans the full text to check whether it is a total match.

10

Experimental Results (1/4)

D(m) (characters): the average interval of matches in the 8_FIMM

T8FIMM: the matching time for one character on the 8_FIMM

TMPU: the average matching time on the MPU

n8FIMM: the number of 8_FIMMs

Increasing the compression ratio m increases the number of false positives, so the number of partial matches in the 8_FIMM also increases. As a result, the number of interrupts to the MPU increases. When an interrupt occurs during a matching operation in the MPU, the system suspends the 8_FIMM.

We obtain the maximum ratio m that does not suspend the 8_FIMM:

11

Experimental Results (2/4)

Let n8FIMM = 128, the number of patterns k = 512, and the pattern length c = 8. Then PFIMM1024 can store 128 × 512 = 65,536 ClamAV virus sub-patterns.

From the condition, the maximum ratio m that satisfies it is 16.

The total amount of memory for the 8_FIMM is $2^{8} \times 32 \times 8$ = 8,192 × 8 bits. An 8_FIMM of this size fits efficiently into the embedded memory of the Altera FPGA (9 Kbits).
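The memory figure follows from the compressed match vector formula 2^c × (k/m) × 8 with c = 8, k = 512 and m = 16. The short check below reads the slide's product as 2^8 × 32 × 8 and assumes that each of the eight bit units is mapped to one 9-Kbit embedded memory block; that mapping and the names used are assumptions for illustration.

```python
def cmv_bits(c: int, k: int, m: int) -> int:
    """Memory bits of an 8_FIMM with compressed match vectors:
    2^c * (k / m) * 8."""
    return 2 ** c * (k // m) * 8


c, k, m = 8, 512, 16
total_bits = cmv_bits(c, k, m)   # whole 8_FIMM
bits_per_unit = total_bits // 8  # one of the 8 bit units
BLOCK_BITS = 9 * 1024            # assumed capacity of one embedded memory block

print(f"8_FIMM: {total_bits:,} bits = {bits_per_unit:,} x 8 bits")
print(f"block utilization per unit: {bits_per_unit / BLOCK_BITS:.1%}")
```

Under that assumption each unit fills most of one block, which is consistent with the slide's remark that this size fits the embedded memory efficiently.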

12

Experimental Results (3/4)

The first stage used the Altera FPGA StratixIII EP3SL340H1152C3NE5 at 199.40 MHz (consumes 75,826 ALUTs).

The second stage used the embedded processor NiosII/f at 100.00 MHz (consumes 1,478 ALUTs) and 1 GB DDR2-SDRAM.

Our virus scanning engine scans one character in every clock. Thus the throughput is 0.1994 × 8 = 1.595 Gbps.

The memory utilization coefficient (MUC) is (8,192 × 128) / (65,536 × 8) = 2.000 Bytes/Char.
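Both figures on this slide are simple arithmetic and can be rechecked as below. The variable names are chosen for illustration; the numbers are the ones reported on the slides.

```python
clock_mhz = 199.40          # first-stage clock frequency
bits_per_char = 8           # one 8-bit character scanned per clock

# Throughput: one character per clock, converted from Mbps to Gbps.
throughput_gbps = clock_mhz * bits_per_char / 1000

bytes_per_8fimm = 8192      # 8,192 x 8 bits = 8,192 bytes
n_8fimm = 128               # number of 8_FIMM units
n_patterns = 65536          # stored sub-patterns
pattern_length = 8          # characters per sub-pattern (c)

# Memory utilization coefficient: memory bytes per stored pattern character.
muc = (bytes_per_8fimm * n_8fimm) / (n_patterns * pattern_length)

print(f"throughput = {throughput_gbps:.3f} Gbps")
print(f"MUC = {muc:.3f} Bytes/Char")
```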

13

Experimental Results (4/4)

14

Thank you for listening