function Kraft_vector = unsorted_kraft( histogram ) % Kraft_vector = unsorted_kraft(histogram) % Uses the Huffman algorithm to generate a unsorted standard[*] % Kraft vector from a list of % probability (or number of occurrences) of each symbol. % Symbols that don't occur (any zeros in the histogram list) % are assigned a bogus length of 0 bits. % The Kraft vector is the length of a set of Huffman codewords. % % Usage: % plaintext = [7 7 7 7 4 4 3 3 2 1]; % histogram = histo(plaintext(:)); % Kraft_vector = unsorted_kraft( histogram ) % codebook = compactcodeP(Kraft_vector); % print_bitstrings( codebook ) % returns % Kraft_vector = [ 0 3 3 2 2 0 0 2 ]', % meaning that when one compresses that datavector, % "1" is compressed to 3 bits, % "7" is compressed to 2 bits, etc. % and then print_bitstrings returns % % there is no code for "0" % 000 % the 3 bit code for "1" % 001 % the 3 bit code for "2" % 01 % the 2 bit code for "3" % 10 % the 2 bit code for "4" % % there is no code for "5" % % there is no code for "6" % 11 % the 2 bit code for "7". % [*] Given a particular piece of plaintext, % often there are several other % Kraft vectors that create Huffman codes. % For the above example, Kraft_vector = [ 1, 2, 3, 4, 4 ] % and Kraft_vector = [ 1 3 3 3 3 ] % generates very different-looking Huffman codebooks. % (All Huffman codebooks compress that plaintext % into exactly the same number of bits). % % See also HUFFMAN_COMPRESS, HUFFMANLENGTH, FREQUENCY, % COMPACTCODEP, PREFIX_ENCODE, SORTED_KRAFT. % Change log: % 1999-07-03:DAV: David Cary started. sorted_Kraft_vector = sorted_kraft( histogram ); % Unfortunately, this is sorted most-frequent-first. % Now we need to % figure out what order the Kraft vector was sorted [frequencies, scramble_order] = sort(histogram(:)); scramble_order = flipUD(scramble_order); % and then unsort it: Kraft_vector = zeros(length(histogram),1); Kraft_vector(scramble_order( 1:length(sorted_Kraft_vector) )) = ... sorted_Kraft_vector; % Optional: check internal integrity. % calculated most-frequent-first frequencies = flipUD(frequencies); encodedlength1 = ... frequencies( 1:length(sorted_Kraft_vector) )' * sorted_Kraft_vector(:); % calculated in alphabetical order encodedlength2 = histogram(:)' * Kraft_vector; if( isequal( encodedlength1, encodedlength2 ) ), else, encodedlength1 encodedlength2 error('Sorry, internal error') end; if(0) % hufflen.m in huffman.zip % at % ftp://ftp.mathworks.com/pub/contrib/v5/misc/huffman/ % http://www.mathworks.com/ftp/miscv5.shtml % should do exactly the same thing as % unsorted_kraft.m % (perhaps it is faster). % Here's some code that's a factor of 2 faster, % but just a little more difficult to understand. %Subject: % Re: huffman code % Date: % 21 Dec 1998 00:00:00 GMT % From: % David Goodmanson % Organization: % AT&T WorldNet Services % To: % Russell % Newsgroups: % comp.soft-sys.matlab %... %The code below provides some measure of improvement. It's faster, and %the bit strings have trailing blanks but no embedded blanks. % %--Dave %------------------------------------------------------------------ % %function c = huffman(p) %HUFFMAN huffman code for vector of probabilities p. % Output is a string matrix with length(p) rows, % each containing a bit string followed by trailing blanks. % Larger probabilites are assigned '0', smaller are assigned '1'. % c = huffman(p) % 1998 David Goodmanson dgoodmanson@worldnet.att.net % 2725-68th Ave. S.E, Mercer Island, WA 98040 USA bits = '10'; % bit for smaller probability listed first p = p(:); xp = length(p); n = (1:xp)'; % nodes b = zeros(xp-1,2); % b(k-xp,:) are nodes branching from node k for k = xp+1:2*xp-1 [p1 j1] = min(p); p(j1) = nan; [p2 j2] = min(p); b(k-xp,:) = n([j1 j2])'; % minimum probability nodes n(j1) = k; % new node p(j1) = p1+p2; n(j2) = []; p(j2) = []; end c = char(0); c = c(ones(2*xp-1,1)); for k = 2*xp-1:-1:xp+1; anbits = c(k,:); % ancestor prefix bits anbits(anbits==char(0)) = []; c(b(k-xp,:),1:length(anbits)+1) = [[anbits; anbits] bits']; end c = c(1:xp,:); % keep original nodes only c(c==char(0)) = ' '; % note that Matlab zerofills string matrices with char(0) %---------------------------------------------------------------- %*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* % %David Goodmanson dgoodmanson@worldnet.att.net %909 4th Ave. N, Apt. D %Seattle, WA 98109 USA % % %-- %http://www.dejanews.com/[ST_rn=ps]/getdoc.xp?AN=424408620 end; % end unsorted_kraft.m