Compression: Huffman Coding #1513
Open: aladin002dz wants to merge 8 commits into TheAlgorithms:master from aladin002dz:Compression
Changes from 3 commits
Commits (8)
90f1982 Huffman Compression Algorithm (aladin002dz)
4e87b22 Huffman Compression Algorithm Comments (aladin002dz)
c4e98aa Huffman Compression Algorithm Comments (aladin002dz)
bd83db9 Compression Huffman: Optimize algorithm (aladin002dz)
b02eae8 Compression Huffman: prettier (aladin002dz)
f951e51 Compression: Huffman coding using Heaps (aladin002dz)
85df912 Prettier Style (aladin002dz)
b583f2e Compression; removing unecessary logging (aladin002dz)
@@ -0,0 +1,146 @@
/**
 * Huffman Coding is a lossless data compression algorithm that uses variable-length codes to represent characters.
 *
 * The algorithm assigns shorter codes to characters that occur more frequently, so the encoded data usually takes fewer bits than a fixed-length encoding.
 *
 * Huffman Coding is used in file compression (for example DEFLATE, the basis of ZIP and gzip), data transmission, and image formats such as JPEG.
 *
 * More information on Huffman Coding can be found here: https://en.wikipedia.org/wiki/Huffman_coding
 */

/**
 * Builds a frequency table from a string.
 * @example
 * buildFrequencyTable('this is an example for huffman encoding')
 * returns (keys listed alphabetically here; the object itself keeps first-occurrence order)
 * { ' ': 6, a: 3, c: 1, d: 1, e: 3, f: 3, g: 1, h: 2, i: 3, l: 1, m: 2, n: 4, o: 2, p: 1, r: 1, s: 2, t: 1, u: 1, x: 1 }
 * @param {string} data - The string to build the frequency table from.
 * @returns {Object} - The frequency table.
 */
function buildFrequencyTable(data) {
  const freqTable = {}

  for (const char of data) {
    freqTable[char] = (freqTable[char] || 0) + 1
  }

  return freqTable
}

/**
 * A Huffman Node is a node in a Huffman tree.
 * @class HuffmanNode
 * @property {string|null} char - The character represented by a leaf node; null for internal nodes.
 * @property {number} freq - The frequency of the character, or the combined frequency of the subtree.
 * @property {HuffmanNode|null} left - The left child of the node.
 * @property {HuffmanNode|null} right - The right child of the node.
 */
class HuffmanNode {
  constructor(char, freq) {
    this.char = char
    this.freq = freq
    this.left = null
    this.right = null
  }
}

/**
 * Builds a Huffman tree from a frequency table.
 * @param {Object} freqTable - The frequency table to use for building the tree.
 * @returns {HuffmanNode} - The root node of the Huffman tree.
 */
function buildHuffmanTree(freqTable) {
  const nodes = Object.keys(freqTable).map(
    (char) => new HuffmanNode(char, freqTable[char])
  )

  while (nodes.length > 1) {
    // Re-sort on every pass so the two least frequent nodes sit at the front.
    nodes.sort((a, b) => a.freq - b.freq)
    const left = nodes.shift()
    const right = nodes.shift()
    // Merge them under a parent that carries the combined frequency.
    const parent = new HuffmanNode(null, left.freq + right.freq)
    parent.left = left
    parent.right = right
    nodes.push(parent)
  }

  return nodes[0]
}

/**
 * Builds a Huffman code table from a Huffman tree.
 * @param {HuffmanNode} root - The root node of the Huffman tree.
 * @param {string} [prefix=''] - The prefix to use for the Huffman codes.
 * @param {Object} [codes={}] - The Huffman code table.
 * @returns {Object} - The Huffman code table.
 */
function buildHuffmanCodes(root, prefix = '', codes = {}) {
  if (root) {
    if (root.char) {
      codes[root.char] = prefix
    }
    buildHuffmanCodes(root.left, prefix + '0', codes)
    buildHuffmanCodes(root.right, prefix + '1', codes)
  }
  return codes
}

/**
 * Encodes a string using Huffman Coding.
 * @param {string} data - The string to encode.
 * @param {Object} freqTable - The frequency table to use for encoding.
 * @returns {string} - The encoded string.
 */
function encodeHuffman(data, freqTable) {
  const root = buildHuffmanTree(freqTable)
  const codes = buildHuffmanCodes(root)

  let encodedData = ''
  for (const char of data) {
    encodedData += codes[char]
  }

  return encodedData
}

/**
 * Decodes a string using Huffman Coding.
 * @param {string} encodedData - The string to decode.
 * @param {HuffmanNode} root - The root node of the Huffman tree.
 * @returns {string} - The decoded string.
 */
function decodeHuffman(encodedData, root) {
  let decodedData = ''
  let currentNode = root
  for (const bit of encodedData) {
    if (bit === '0') {
      currentNode = currentNode.left
    } else {
      currentNode = currentNode.right
    }

    if (currentNode.char) {
      decodedData += currentNode.char
      currentNode = root
    }
  }

  return decodedData
}

// Example usage
const data = 'this is an example for huffman encoding'
const freqTable = buildFrequencyTable(data)
const root = buildHuffmanTree(freqTable)
const encodedData = encodeHuffman(data, freqTable)
console.log('Encoded Data:', encodedData)

const decodedData = decodeHuffman(encodedData, root)
console.log('Decoded Data:', decodedData)

export {
  buildHuffmanCodes,
  buildHuffmanTree,
  encodeHuffman,
  decodeHuffman,
  buildFrequencyTable
}
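One case the file above does not appear to cover is input with a single distinct symbol: the tree is then a lone leaf, the leaf gets the empty code, and the encoded string comes out empty. The following is a minimal sketch of one way to handle it, not the PR's implementation. The names `buildTree`, `buildCodes`, `encode`, and `decode` are hypothetical, and nodes are plain objects rather than `HuffmanNode` instances.

```javascript
// Hypothetical helpers, not part of the PR: a compact Huffman round trip
// that also handles input with a single distinct symbol.
function buildTree(data) {
  const freq = new Map()
  for (const ch of data) freq.set(ch, (freq.get(ch) || 0) + 1)
  const nodes = [...freq].map(([char, f]) => ({
    char,
    freq: f,
    left: null,
    right: null
  }))
  while (nodes.length > 1) {
    nodes.sort((a, b) => a.freq - b.freq)
    const left = nodes.shift()
    const right = nodes.shift()
    nodes.push({ char: null, freq: left.freq + right.freq, left, right })
  }
  return nodes[0] || null
}

function buildCodes(node, prefix = '', codes = {}) {
  if (!node) return codes
  if (node.char !== null) {
    // A lone leaf at the root would otherwise receive the empty code.
    codes[node.char] = prefix || '0'
    return codes
  }
  buildCodes(node.left, prefix + '0', codes)
  buildCodes(node.right, prefix + '1', codes)
  return codes
}

function encode(data, codes) {
  let bits = ''
  for (const ch of data) bits += codes[ch]
  return bits
}

function decode(bits, root) {
  if (!root) return ''
  // Single-symbol tree: every bit stands for the same character.
  if (root.char !== null) return root.char.repeat(bits.length)
  let out = ''
  let node = root
  for (const bit of bits) {
    node = bit === '0' ? node.left : node.right
    if (node.char !== null) {
      out += node.char
      node = root
    }
  }
  return out
}

const root = buildTree('aaaa')
const codes = buildCodes(root)
const bits = encode('aaaa', codes)
console.log(bits, decode(bits, root)) // prints "0000 aaaa"
```

For the sample sentence used in the PR, the same round trip yields 157 bits against 312 for one byte per character. The result is still a string of '0' and '1' characters, so it would need to be packed into bytes before it saves any space.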
@@ -0,0 +1,33 @@
import {
  buildHuffmanCodes,
  buildHuffmanTree,
  encodeHuffman,
  decodeHuffman,
  buildFrequencyTable
} from '../Huffman'

describe('Huffman Coding', () => {
  let data, freqTable, root

  beforeEach(() => {
    data = 'this is an example for huffman encoding'
    freqTable = buildFrequencyTable(data)
    root = buildHuffmanTree(freqTable)
  })

  it('should encode and decode a string correctly', () => {
    const encodedData = encodeHuffman(data, freqTable)
    const decodedData = decodeHuffman(encodedData, root)

    expect(decodedData).toEqual(data)
  })

  it('should build Huffman codes correctly', () => {
    const codes = buildHuffmanCodes(root)

    expect(codes['t']).toEqual('01010')
    expect(codes['h']).toEqual('11111')
    expect(codes['i']).toEqual('1001')
    expect(codes['s']).toEqual('0010')
  })
})
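The exact codes asserted in the second test depend on how ties between equal frequencies are broken, so they can change if the tree-building strategy changes. A property that holds for any valid Huffman table is that no code is a prefix of another. The sketch below shows one way such a check might look. `isPrefixFree` is a hypothetical helper, not something the PR defines.

```javascript
// Hypothetical helper, not part of the PR: returns true when no code in the
// table is a prefix of (or equal to) another code.
function isPrefixFree(codes) {
  // After a lexicographic sort, any prefix is immediately followed by a
  // string that starts with it, so checking adjacent pairs is enough.
  const values = Object.values(codes).sort()
  for (let i = 1; i < values.length; i++) {
    if (values[i].startsWith(values[i - 1])) return false
  }
  return true
}

// The four codes asserted in the test above form a prefix-free set.
console.log(isPrefixFree({ t: '01010', h: '11111', i: '1001', s: '0010' })) // true
// '0' is a prefix of '01', so this table could not be decoded unambiguously.
console.log(isPrefixFree({ a: '0', b: '01' })) // false
```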
@@ -1,22 +1,21 @@
-import { evaluatePostfixExpression } from '../EvaluateExpression.js';
+import { evaluatePostfixExpression } from '../EvaluateExpression.js'

 describe('evaluatePostfixExpression', () => {
   it('should evaluate a valid expression', () => {
-    const expression = '3 4 * 2 / 5 +'; // (3 * 4) / 2 + 5 = 11
-    const result = evaluatePostfixExpression(expression);
-    expect(result).toBe(11);
-  });
+    const expression = '3 4 * 2 / 5 +' // (3 * 4) / 2 + 5 = 11
+    const result = evaluatePostfixExpression(expression)
+    expect(result).toBe(11)
+  })

   it('should handle division by zero', () => {
-    const expression = '3 0 /'; // Division by zero
-    const result = evaluatePostfixExpression(expression);
-    expect(result).toBe(null);
-  });
+    const expression = '3 0 /' // Division by zero
+    const result = evaluatePostfixExpression(expression)
+    expect(result).toBe(null)
+  })

   it('should handle an invalid expression', () => {
-    const expression = '3 * 4 2 / +'; // Invalid expression
-    const result = evaluatePostfixExpression(expression);
-    expect(result).toBe(null);
-  });
-
-});
+    const expression = '3 * 4 2 / +' // Invalid expression
+    const result = evaluatePostfixExpression(expression)
+    expect(result).toBe(null)
+  })
+})
Why do you sort in every iteration?
Because pushing a new node may change the order.
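To illustrate the point of this exchange: once the two front nodes are removed the array is still sorted, and only the new parent can be out of place. A minimal sketch, assuming plain object nodes and a hypothetical `insertSorted` helper (neither is in the PR), sorts once and then places each parent by binary search:

```javascript
// Hypothetical helper, not from the PR: insert `node` into `nodes`, which is
// already sorted by ascending `freq`, after any nodes of equal frequency.
function insertSorted(nodes, node) {
  let lo = 0
  let hi = nodes.length
  while (lo < hi) {
    const mid = (lo + hi) >>> 1
    if (nodes[mid].freq <= node.freq) lo = mid + 1
    else hi = mid
  }
  nodes.splice(lo, 0, node)
}

// Sketch of the tree-building loop with a single up-front sort.
function buildTreeSortedOnce(freqTable) {
  const nodes = Object.keys(freqTable)
    .map((char) => ({ char, freq: freqTable[char], left: null, right: null }))
    .sort((a, b) => a.freq - b.freq)

  while (nodes.length > 1) {
    const left = nodes.shift()
    const right = nodes.shift()
    insertSorted(nodes, {
      char: null,
      freq: left.freq + right.freq,
      left,
      right
    })
  }

  return nodes[0] || null
}

const tree = buildTreeSortedOnce({ a: 5, b: 2, c: 1, d: 1 })
console.log(tree.freq) // 9
```

Because the new parent goes after nodes of equal frequency, this should produce the same tree as re-sorting with a stable sort. Each step still costs O(n) for `shift` and `splice`, so it is a halfway point rather than a replacement for a heap.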
Wouldn't it be more appropriate to use a min heap here (our existing implementations should work, you just need to import and use them), given that you always extract the least frequent ones and only push new ones?
I think it would add a lot more complexity, wouldn't it?
I think the API of our heap should be pretty straightforward to use. A heap is pretty much the data structure for this use case; it would significantly help the time complexity. If you want me to, I can make the necessary changes.
I'll try to implement it myself, then I'll need your valuable feedback and guidance.
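As a rough sketch of what the heap-based version discussed in this thread could look like: the loop in the diff removes the two least frequent nodes on each pass, so a min-heap keyed on frequency fits. The `MinHeap` class below is a small stand-in written for illustration. The repository's own heap classes may expose a different API, and `buildHuffmanTreeWithHeap` is a hypothetical name.

```javascript
// Minimal binary min-heap keyed on `freq` (illustrative stand-in only).
class MinHeap {
  constructor() {
    this.items = []
  }

  get size() {
    return this.items.length
  }

  swap(i, j) {
    const tmp = this.items[i]
    this.items[i] = this.items[j]
    this.items[j] = tmp
  }

  push(node) {
    this.items.push(node)
    let i = this.items.length - 1
    // Sift up while the new node is lighter than its parent.
    while (i > 0) {
      const parent = (i - 1) >> 1
      if (this.items[parent].freq <= this.items[i].freq) break
      this.swap(parent, i)
      i = parent
    }
  }

  pop() {
    const top = this.items[0]
    const last = this.items.pop()
    if (this.items.length > 0) {
      this.items[0] = last
      let i = 0
      // Sift down until neither child is lighter than the current node.
      while (true) {
        const l = 2 * i + 1
        const r = l + 1
        let smallest = i
        if (l < this.items.length && this.items[l].freq < this.items[smallest].freq) smallest = l
        if (r < this.items.length && this.items[r].freq < this.items[smallest].freq) smallest = r
        if (smallest === i) break
        this.swap(smallest, i)
        i = smallest
      }
    }
    return top
  }
}

// Builds the tree with O(log n) work per push and pop instead of a full sort.
function buildHuffmanTreeWithHeap(freqTable) {
  const heap = new MinHeap()
  for (const char of Object.keys(freqTable)) {
    heap.push({ char, freq: freqTable[char], left: null, right: null })
  }
  while (heap.size > 1) {
    const left = heap.pop()
    const right = heap.pop()
    heap.push({ char: null, freq: left.freq + right.freq, left, right })
  }
  return heap.size === 1 ? heap.pop() : null
}
```

A heap may break ties between equal frequencies differently from the stable sort, so the individual codes asserted in the test file could change even though the total encoded length stays the same.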