Compressing a Single PDB

Presented by: Danielle Sauer

CMPUT 652 Project

December 1, 2004

Outline

  Problem Definition
  Key Background
  Approach
  Results
  Conclusion

Problem Definition

Motivation: What happens when a pattern database is too large to store in memory?

We can:
  Use several PDBs (and combine them into one).
  Compress individual PDBs.

My solution: Compress a single PDB.

Key Background

Pattern databases generally store two things:
  A state
  The state’s distance to goal

The number of collisions is affected by:
  The hash function
  The size of the PDB

Approach

  Overview
  Hash Functions
  Puzzle Types
  Domain Abstractions

Overview of Approach

Stores only the distance in the PDB. How to resolve collisions?

Given state a_i already in entry E in the PDB.

State a_j maps to entry E and collides with a_i.

Take the minimum of the distance values of a_i and a_j:

E = min(d_i, d_j)

Lossy compression (throwing away values).
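The collision rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the names `compress_pdb` and `lookup`, the use of a modulo to fit the hash value into the table, and the toy distances are all assumptions made for the example.

```python
def compress_pdb(distances, hash_fn, table_size):
    """Build a table of `table_size` entries that holds distances only.

    `distances` maps each state to its distance to goal.
    `hash_fn` maps a state to a non-negative integer (assumed interface).
    """
    table = [None] * table_size  # None marks an entry no state has mapped to
    for state, dist in distances.items():
        entry = hash_fn(state) % table_size  # modulo is an assumption
        if table[entry] is None or dist < table[entry]:
            table[entry] = dist  # on collision keep E = min(d_i, d_j)
    return table


def lookup(table, hash_fn, state):
    """Return the stored heuristic value for `state`."""
    return table[hash_fn(state) % len(table)]


# Toy example (hypothetical data): four states squeezed into two entries.
toy_distances = {0: 0, 1: 3, 2: 5, 3: 1}
toy_table = compress_pdb(toy_distances, lambda s: s, 2)
```

Because each entry keeps the minimum over all states that map to it, a lookup never returns more than the state's true stored distance. The values can only become less informative, which is the sense in which the compression is lossy.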

Hash Functions

Three hash functions:
  Base 10 hash function
  Perfect hash function (permutation)
  Positional ordering hash function

Base 10 and Perfect Hash

Base 10 Hash
  Go through each entry in the puzzle (row by row).
  Hashvalue = 102 345 678

Perfect Hash Function
  Based on permutations
  No gaps in the hash table
  No collisions
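The two hash functions on this slide can be sketched as follows. The slide's puzzle image is not available, so the example state (rows 1 0 2 / 3 4 5 / 6 7 8) is an assumption chosen to match the hash value 102 345 678. The slide says only that the perfect hash is based on permutations; the lexicographic ranking used here is one common choice and is also an assumption, as are the function names.

```python
from math import factorial


def base10_hash(tiles):
    """Concatenate the tile values, read row by row, as decimal digits.

    Only meaningful while every tile value is a single digit.
    """
    value = 0
    for tile in tiles:
        value = value * 10 + tile
    return value


def permutation_rank(tiles):
    """Lexicographic rank of a permutation of 0..n-1.

    Ranks run from 0 to n! - 1 with no gaps, so distinct states
    never collide (assumed ranking scheme, see lead-in).
    """
    n = len(tiles)
    rank = 0
    for i, tile in enumerate(tiles):
        # Count later tiles smaller than this one.
        smaller_later = sum(1 for later in tiles[i + 1:] if later < tile)
        rank += smaller_later * factorial(n - 1 - i)
    return rank


# Assumed example state, read row by row from a 3x3 puzzle.
state = (1, 0, 2, 3, 4, 5, 6, 7, 8)
base10_value = base10_hash(state)
perfect_value = permutation_rank(state)
```

The base 10 value for a 3x3 puzzle can be as large as 876 543 210 while only 9! = 362 880 states exist, so most base 10 hash values are never used. The permutation rank fills the range 0 to 9! - 1 exactly.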

Positional Ordering Hash

Ignore the nondistinct value with largest number of occurrences.

Position: 1 5 7 8 6

Tile #: 0 2 2 2 3

Hashvalue = 15786
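One possible reading of the positional ordering hash is sketched below. It assumes that the state is a sequence of tile values indexed by position starting at 0, that the remaining tiles are ordered by tile number and then by position, and that positions are single digits. The function name and the example state are hypothetical; the state was chosen so that it reproduces the positions and tile numbers shown above.

```python
from collections import Counter


def positional_ordering_hash(tiles):
    """Concatenate the positions of the kept tiles, ordered by tile number.

    The repeated value with the most occurrences is ignored. If no value
    repeats, nothing is ignored.
    """
    value_counts = Counter(tiles)
    most_common_value, occurrences = value_counts.most_common(1)[0]
    ignored = most_common_value if occurrences > 1 else None

    # Sort by (tile number, position) so equal tiles appear in position order.
    kept = sorted(
        (tile, pos) for pos, tile in enumerate(tiles) if tile != ignored
    )

    value = 0
    for _, pos in kept:
        value = value * 10 + pos  # assumes positions are single digits
    return value


# Assumed example: value 1 occurs four times and is ignored.
# Tile 0 is at position 1, tile 2 at positions 5, 7, 8, tile 3 at position 6.
example_state = (1, 0, 1, 1, 1, 2, 3, 2, 2)
example_hash = positional_ordering_hash(example_state)
```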

Puzzle Types

  8-puzzle from class
  Pancake Puzzle
  Topspin
  Physical-based sliding tile puzzle

Domain Abstractions

One “don’t care” symbol.

Maps a tile to itself or maps it to the “don’t care” symbol.

d_i(c) =
  c              if c is an element of G_i
  blank          if c = blank
  “don’t care”   otherwise
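The abstraction mapping can be written directly as a small function. In this sketch the blank is represented by 0 and the “don’t care” symbol by the string "x". Both encodings, the names `abstract_tile` and `abstract_state`, and the example set of kept tiles are assumptions made for illustration.

```python
BLANK = 0        # assumed encoding of the blank
DONT_CARE = "x"  # assumed encoding of the "don't care" symbol


def abstract_tile(c, kept_tiles):
    """Map tile c to itself, to the blank, or to the "don't care" symbol."""
    if c in kept_tiles:
        return c
    if c == BLANK:
        return BLANK
    return DONT_CARE


def abstract_state(tiles, kept_tiles):
    """Apply the tile mapping to every position of a state."""
    return tuple(abstract_tile(c, kept_tiles) for c in tiles)


# Hypothetical example: keep tiles 1, 2 and 3 of an 8-puzzle state.
concrete = (1, 0, 2, 3, 4, 5, 6, 7, 8)
abstracted = abstract_state(concrete, {1, 2, 3})
```

All tiles outside the kept set become indistinguishable, so many concrete states share one abstract state and therefore one PDB entry.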

Results

Expectation: As the size of the table becomes smaller, the number of nodes generated should become larger.

Reasoning:
  This method is lossy – we are throwing away heuristic values.
  The stored distance values will not be accurate heuristics for some of the states.

Expected Results

[Figure: Nodes Generated per PDB Size. x-axis: PDB Size]

Preliminary Results

[Figure: Nodes Generated per PDB Size. x-axis: PDB Size (2^n), ticks 8, 10, 11, 14, 16, 18, 20; y-axis: nodes generated, 0 to 800; series: Hash 1, Hash 2, Hash 3]

Summary

This method stores only the distance in the PDB.

It resolves collisions by storing the smallest distance of the colliding states.

Preliminary results suggest we can use a much smaller amount of memory and still get the same performance as a larger PDB.
