parsing - Making an attoparsec-based parser more efficient
I wrote a simple text STL (Standard Tessellation Language) parser using attoparsec. An STL file contains a collection of facets. Each facet contains a normal and the three vertices of a triangle. A typical STL file can be large (~100 MB or more).
The STL file format is:

    solid
      facet normal nx ny nz
        outer loop
          vertex x1 y1 z1
          vertex x2 y2 z2
          vertex x3 y3 z3
        endloop
      endfacet
    endsolid
The parser code is here. The full implementation can be found at stlreader.
    {-# LANGUAGE BangPatterns, OverloadedStrings #-}

    -- Imports inferred from the names used below; AL and TIO are assumed to be
    -- the lazy Text variants, since AL.parse is fed the result of TIO.readFile.
    import Control.Applicative ((<$>), (<*>), (<*), (*>))
    import Control.Monad (liftM)
    import Data.Attoparsec.Text
    import qualified Data.Attoparsec.Text.Lazy as AL
    import Data.Char (isAlphaNum)
    import Data.Text (Text)
    import qualified Data.Text as T
    import qualified Data.Text.Lazy.IO as TIO
    import System.Environment (getArgs)

    -- | Point represents a vertex of the triangle.
    data Point a = Point !a !a !a deriving Show

    -- | Vector (used for the facet normal).
    data Vector a = Vector !a !a !a deriving Show

    -- | Parse a coordinate triplet that follows the keyword s.
    coordinates :: (Fractional a) => Text -> (a -> a -> a -> b) -> Parser b
    coordinates s f = do
        skipSpace
        _ <- string s
        !x <- coordinate
        !y <- coordinate
        !z <- coordinate
        return $! f x y z
      where
        coordinate = skipWhile isHorizontalSpace *> fmap realToFrac double
    {-# INLINE coordinates #-}

    type RawFacet a = (Vector a, Point a, Point a, Point a)

    -- | Parse a facet. A facet comprises a normal and three vertices.
    facet :: Fractional a => Parser (RawFacet a)
    facet = (,,,) <$> beginFacet
                <* (skipSpace *> "outer loop")
                <*> vertexPoint
                <*> vertexPoint
                <*> vertexPoint
                <* (skipSpace <* "endloop" <* endFacet)
                <?> "facet"
      where
        beginFacet  = skipSpace <* "facet" *> coordinates "normal" Vector
        endFacet    = skipSpace <* string "endfacet"
        vertexPoint = coordinates "vertex" Point
    {-# INLINE facet #-}

    rawFacets :: Fractional a => Parser [RawFacet a]
    rawFacets = beginSolid *> many' facet <* endSolid
      where
        solidName  = option "default" (skipWhile isHorizontalSpace *> fmap T.pack (many1 $ satisfy isAlphaNum))
        beginSolid = skipSpace <* "solid" *> solidName <?> "start solid"
        endSolid   = skipSpace <* "endsolid" <?> "end solid"

    -- | Read a text STL file. STL extensions for color etc. are not supported in this version.
    readTextSTL :: Fractional a => FilePath -> IO (Either String [RawFacet a])
    readTextSTL path = liftM (AL.eitherResult . AL.parse rawFacets) (TIO.readFile path)

    main :: IO Int
    main = do
        (path:_) <- getArgs
        putStrLn $ "Parsing STL file: " ++ path
        s <- readTextSTL path
        putStrLn "Parsing complete"
        case s of
            Left err     -> putStrLn err
            Right facets -> putStrLn $ "Num facets : " ++ show (length facets)
        return 0
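To make the grammar concrete, here is a minimal sketch that parses a single facet from an in-memory sample. It is a condensed variant written only for illustration, not the code from stlreader: the names `Triple`, `Facet`, `keyword`, `triple`, `facetP`, `sampleFacet` and `parseSample` are hypothetical, coordinates are fixed to `Double`, and it uses the strict `parseOnly` entry point from `Data.Attoparsec.Text` rather than the lazy one.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Attoparsec.Text (Parser, double, parseOnly, skipSpace, string)
import Data.Text (Text)
import qualified Data.Text as T

-- | Hypothetical simplified types: every triple is (x, y, z) as Doubles.
type Triple = (Double, Double, Double)

-- | A normal followed by the three vertices of the triangle.
data Facet = Facet Triple Triple Triple Triple deriving (Eq, Show)

-- | Skip leading whitespace (including newlines), then match a keyword.
keyword :: Text -> Parser ()
keyword k = skipSpace *> string k *> pure ()

-- | Parse a keyword followed by three whitespace-separated numbers.
triple :: Text -> Parser Triple
triple k = do
  keyword k
  x <- skipSpace *> double
  y <- skipSpace *> double
  z <- skipSpace *> double
  pure (x, y, z)

-- | Parse one "facet ... endfacet" block.
facetP :: Parser Facet
facetP =
  Facet <$> (keyword "facet" *> triple "normal")
        <*  keyword "outer loop"
        <*> triple "vertex"
        <*> triple "vertex"
        <*> triple "vertex"
        <*  keyword "endloop"
        <*  keyword "endfacet"

-- | A made-up sample facet: a unit right triangle in the z = 0 plane.
sampleFacet :: Text
sampleFacet = T.unlines
  [ "facet normal 0 0 1"
  , "  outer loop"
  , "    vertex 0 0 0"
  , "    vertex 1 0 0"
  , "    vertex 0 1 0"
  , "  endloop"
  , "endfacet"
  ]

-- | Run the parser on the sample; Left carries attoparsec's error message.
parseSample :: Either String Facet
parseSample = parseOnly facetP sampleFacet
```

Loading this in GHCi and evaluating `parseSample` should give a `Right` value holding the normal and the three vertices. A small fixture like this is handy for checking that any performance-motivated change still accepts the same input.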
I benchmarked the code against the C parser supplied with MeshLab. When tested with a 69 MB scan, MeshLab completed the job in 13 s, whereas it took 22 s with attoparsec. Though attoparsec enabled me to parse faster than Parsec (36 s with Parsec), I still have a long way to go.
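To put those timings on a common scale, the sketch below converts them to throughput. The only inputs are the figures quoted above (69 MB, and 13 s, 22 s and 36 s); the names `throughput`, `timings` and `report` are hypothetical helpers. The results come out at roughly 5.3 MB/s for MeshLab, 3.1 MB/s for attoparsec and 1.9 MB/s for Parsec, so the attoparsec version is about 1.7 times slower than the C parser.

```haskell
-- | Throughput in MB/s for a file of the given size (MB) parsed in the given time (s).
throughput :: Double -> Double -> Double
throughput sizeMB seconds = sizeMB / seconds

-- | Wall-clock timings quoted for the 69 MB scan, in seconds.
timings :: [(String, Double)]
timings = [("MeshLab (C)", 13), ("attoparsec", 22), ("Parsec", 36)]

-- | One human-readable line per parser.
report :: [String]
report =
  [ name ++ ": " ++ show (throughput 69 secs) ++ " MB/s"
  | (name, secs) <- timings
  ]
```

Evaluating `mapM_ putStrLn report` in GHCi prints one line per parser.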
How can I improve the parser further?