Hi, I have a problem where I have this big video file. I need to chunk them into blocks of 1024 bytes long. After that, I need to find SHA256 of each block. But the catch is that, I need to start from the last chunk. Append that hash with the block before that and find the combined hash and so on.. till we hit the first block and return the value of the one for the first block. I thought of using Lazy ByteString (Data.ByteString.Lazy.Char8 module). But laziness does not seem to be of an advantage here as we cannot compute the hash as we go. We need to hit the last chunk and work from there, recursion fits nicely here. Another approach is to do a chunking and store them as chunks in the list and then use foldr. I am wondering if there is an elegant approach to the problem. I haven't studied lens. Don't even know if it can be applied to this type of a problem. Any suggestions and help will be greatly appreciated. :) Thanks Ramakrishnan
I solved it with foldl, it turned out to be not too bad. But criticisms and improvements and reviews always welcome. Here is the code for the complete program: module Main where import System.IO import qualified Data.ByteString.Lazy as B import qualified Data.ByteString.Lazy as BL import qualified Data.ByteString.Lazy.Char8 as BLC import System.Environment (getArgs) import qualified Data.Digest.Pure.SHA as SHA chunkify :: Handle -> Int -> [BLC.ByteString] -> IO [BLC.ByteString] chunkify fileH size bxs = do ineof <- hIsEOF fileH if ineof then return bxs else do chunk <- BL.hGet fileH size chunkify fileH size (chunk:bxs) hashChain bxs = let (bx':bxs') = bxs hDigest = SHA.sha256 bx' h = SHA.bytestringDigest hDigest in foldl combine hDigest bxs' where combine chash chunk' = SHA.sha256 $ BLC.append chunk' $ SHA.bytestringDigest chash main = do args <- getArgs handle <- openFile (head args) ReadMode bxs <- chunkify handle 1024  putStrLn $ "number of chunks = " ++ (show (length bxs)) putStrLn $ "hash value = " ++ (show $ hashChain bxs)
---- On Thu, 30 Jan 2014 17:52:32 +0000 Ramakrishnan Muthukrishnan wrote ---- >I solved it with foldl, it turned out to be not too bad. But criticisms >and improvements and reviews always welcome. If you are using foldl, it might be a good idea to look at the enumerator/iteratee library. I have heard good stuff being said about it for high-performing foldl-based IO in #haskell. http://hackage.haskell.org/package/iteratee (hackage) http://www.tiresiaspress.us/haskell/iteratee/Examples/ (examples) >hashChain bxs = > let (bx':bxs') = bxs > hDigest = SHA.sha256 bx' > h = SHA.bytestringDigest hDigest > in > foldl combine hDigest bxs' > where combine chash chunk' = > SHA.sha256 $ BLC.append chunk' $ SHA.bytestringDigest chash >main = do > args <- getArgs > handle <- openFile (head args) ReadMode > bxs <- chunkify handle 1024  > putStrLn $ "number of chunks = " ++ (show (length bxs)) > putStrLn $ "hash value = " ++ (show $ hashChain bxs) > Am I missing something here? You are using left fold. That does this. foldl f a [b1,b2,b3...] is (f (f (f a b1) b2) b3) So, your chunking method is (sha256 (chunk10 ++ .. (sha256 (chunk4 ++ (sha256 (chunk3 ++ (sha256 (chunk2 ++ hash_chunk1))))))) But your initial question was for something that could only be solved by foldr. i.e. What you initially wanted was (sha256 (chunk1 ++ .. (sha256 chunk7 ++ (sha256 (chunk8 ++ (sha256 (chunk9 ++ hash_chunk10)))) Are you missing a reverse somewhere? or am I? Also, you could write your function as hashChain (bx:bxs) = bxs ... rather than using the let for this. Which again brings me to another question. What about hashChain ? While this is not recursive and may never happen, it is still good-practice to never have partially defined functions.