cp's OEIS Frontend

This is a front-end for the Online Encyclopedia of Integer Sequences, made by Christian Perfect. The idea is to provide OEIS entries in non-ancient HTML, and then to think about how they're presented visually. The source code is on GitHub.

A263129 Numbers whose binary representation is the concatenation of all the words in one of its possible Huffman encodings.

This page as a plain text file.
%I A263129 #51 Jan 26 2022 08:36:37
%S A263129 1024538,1024675,1024789,1024837,1024936,1025347,1025378,1025384,
%T A263129 1026593,10234987,10236597,10236758,10258346,10259347,10267845,
%U A263129 10278534,10283546,10293486,10293675,10294837
%N A263129 Numbers whose binary representation is the concatenation of all the words in one of its possible Huffman encodings.
%C A263129 This sequence is obtained using Huffman coding with decimal numbers as input.
%C A263129 Huffman coding is largely used for lossless data compression and with its algorithm we obtain "Huffman trees" whose branches generate sets of "0-1 words" (codewords), all different, each of them corresponding to a different input symbol.
%C A263129 The Huffman tree is constructed in such a way that the shortest codewords in the output sets match the more frequent symbol in the input (this is the point of Huffman compression).
%C A263129 Moreover, Huffman coding is structured for using codewords that avoid ambiguities (useful for decompression algorithm). E.g., if we have the word "01" in a set, we don't have "010" or "011" (or other longer words that start with "01") in the same set.
%C A263129 In our case, an output codeword set depends especially on the number of different digits and weight of each of them.
%C A263129 So if, for every number n, joining each codeword of one of its output codeword sets in the corresponding decimal digit position, we obtain a "bitstream" that matches to the same n input number binary representation, the number is added to the sequence.
%C A263129 Now, a brief analysis of the first few terms in the sequence.
%C A263129 Huf(Dec[6.0])=16 means "All 6-decimal-digit numbers, 0 repeated, correspond to 16 binary digits of a Huffman coding output". A wordset is, e.g., [00, 01, 100, 101, 110, 111].
%C A263129 But the lowest Dec[6.0] number is 102345, that is, the 17 binary digits 11000111111001001 (i.e., Bin(Dec[6.0])=17).
%C A263129 Also, Huf(Dec[6.(1,2)])=14 (this means "All 6 digits, 1 digit two times") and lowest Bin(Dec[6.(1,2)])=17. The other cases from [6.(2,2)] to [6.(1,6)] are all unbalanced. Thus, the sequence includes no numbers that are 1 to 6 digits in length.
%C A263129 Since with Huf(Dec[7.0])=20, and lowest Bin(Dec[7.0])=20, the first entry of this sequence is 1024538, that is the binary number 11111010001000011010 with the wordset [10, 111, 110, 011, 010, 001, 000].
%C A263129 All the other [7.x] cases are unbalanced, so there are only 9 numbers of 7 decimal digits (different or not) in the sequence.
%C A263129 For Huf(Dec[8.0])=24, we have only the codeword set [111, 110, 101, 011, 001, 010, 100, 000], that generates 297 valid n numbers that are in the sequence (see Program).
%C A263129 E.g., Huf(Dec[9.(1,2)])=27 using a set like [10, 110, 000, 001, 010, 011, 1110, 1111] where the word "10" is used two times in the corresponding decimal digit position, as in 132689520 (see Example).
%C A263129 Is not clear if, with the growth of the length (number of decimal digits) of decimal numbers, and the corresponding growth of the length (number of binary digits) of binary numbers, Huffman coding performs a better compression, and if this sequence is finite or not.
%D A263129 Francesco Di Matteo, Playful Sequences, Game Edizioni (forthcoming), pages 24-25.
%H A263129 Francesco Di Matteo, <a href="/A263129/b263129.txt">Table of n, a(n) for n = 1..306</a>
%H A263129 Slawek Ligus, <a href="http://huffman.ooz.ie/">Huffman tree generator</a>
%H A263129 Rosettacode, <a href="http://rosettacode.org/wiki/Huffman_coding">Huffman coding</a>, a large collection of Huffman coding programming routines.
%H A263129 Wikipedia, <a href="https://en.wikipedia.org/wiki/Huffman_coding">Huffman coding</a>
%e A263129 Decimal   Binary                 Huffman
%e A263129 1024538 = 11111010001000011010 = 111.110.10.001.000.011.010
%e A263129 1024675 = 11111010001010100011 = 111.110.100.010.101.00.011
%e A263129 ..
%e A263129 1026593 = 11111010101000100001 = 111.110.101.01.000.100.001
%e A263129 10234987 = 100111000010110001101011 = 100.111.000.010.110.001.101.011
%e A263129 10236597 = 100111000011001010110101 = 100.111.000.011.001.010.110.101
%e A263129 ..
%e A263129 132689520 = 111111010001010111001110000 = 1111.110.10.001.010.1110.011.10.000
%e A263129 ..
%e A263129 369752480 = 10110000010011111100110100000 = 101.100.0001.001.111.110.011.010.0000
%e A263129 etc.
%o A263129 (Python)
%o A263129 import itertools
%o A263129 decimal = []
%o A263129 huf = ['000','001','010','011','100','101','110','111']
%o A263129   # This huf list has been processed by a Huffman coding function.
%o A263129   # In order to find all the eight n different digits numbers, as I say
%o A263129   # in Comments, this is the only Huffman encoding valid wordset, so this
%o A263129   # is the simplest example routine.
%o A263129 ndec = list(map("".join, itertools.permutations('0123456789',len(huf))))
%o A263129 nbin = list(map("".join, itertools.permutations(huf)))
%o A263129 for item in nbin:
%o A263129   decimal.append(str(int(item, 2)))
%o A263129 n = set(decimal).intersection(set(ndec))
%o A263129 print(sorted(n), len(n))
%o A263129 # _Francesco Di Matteo_, Oct 18 2015
%Y A263129 Cf. A126014.
%K A263129 nonn,base,fini
%O A263129 1,1
%A A263129 _Francesco Di Matteo_, Oct 10 2015