How do you split a number code in half?
For anyone who’s not familiar with how binary code works, I recommend looking it up, should take a minute or two to learn. Otherwise, you might get lost while reading this.
So, a while ago, during a lecture in one of my courses, we were learning about BCD (Binary Coded Decimal). In BCD, you convert every digit of a decimal number into 4 bits of binary…
Ex:
0 = 0000
1 = 0001
2 = 0010
3 = 0011
...
18 = 0001 1000
19 = 0001 1001
20 = 0010 0000
etc.
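Here's a minimal sketch of that digit-by-digit conversion. The helper name `toBCD` is made up for this illustration and isn't used anywhere else in this post.

```javascript
// Sketch: encode each decimal digit as its own 4-bit group (BCD).
// `toBCD` is a hypothetical helper name, not part of the code further down.
function toBCD(number) {
  return String(number)
    .split("")
    .map((digit) => Number(digit).toString(2).padStart(4, "0"))
    .join(" ");
}

console.log(toBCD(18)); // "0001 1000"
console.log(toBCD(20)); // "0010 0000"
```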
Well, I’ve been working on a project where the save code is… like… very long (depending on the actual data being saved). Trying to think up solutions, I figured out a way to shorten a long code by HALF while still being able to convert it back.
However, this method only works with number codes, and is great when it comes to working with files.
First, let’s use this number as an example:
2477123892
We’re going to convert every digit into 4 bits (a “nibble”).
Now, the highest decimal value we can reach with a nibble is 15 (1111), so rather than converting every number from 0 to 9 into 4 bits, we can go from 0 to 15.
Note: In BCD, 4 bits are used to represent a number from 0 to 9, not 0 to 15, so the method I’m using here is different from BCD.
For this, though, we’ll go from 0 to 14 to keep 15 (“1111”) for another purpose.
Another issue is the order of replacement. If you replace, let’s say, every 12 with 1100 and then go on to replace every 1 with 0001 or 0 with 0000, you’ll also replace the new 0’s and 1’s from the binary code. And if you start with the 0’s and 1’s instead, you’ll break up the numbers between 10 and 14. So, to keep things less complicated, first convert every number from 14 down to 0 into a letter, then convert each letter into a nibble.
14 = o = 1110
13 = n = 1101
12 = m = 1100
11 = l = 1011
10 = k = 1010
9 = j = 1001
8 = i = 1000
7 = h = 0111
6 = g = 0110
5 = f = 0101
4 = e = 0100
3 = d = 0011
2 = c = 0010
1 = b = 0001
0 = a = 0000
^ Note that you’ll need to start with 14, 13, 12, 11, & 10 before the other numbers
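The number-to-letter step from the table above can be sketched like this. `digitsToLetters` is a hypothetical name for illustration only; it builds the letters from character codes instead of typing out both arrays, and it assumes `String.prototype.replaceAll` is available.

```javascript
// Sketch of the first replacement pass: values 14..0 become letters o..a.
// `digitsToLetters` is a hypothetical name used only for this illustration.
function digitsToLetters(numStr) {
  let out = String(numStr);
  // Count down from 14 so that "12" is replaced before "1" and "2".
  for (let value = 14; value >= 0; value--) {
    const letter = String.fromCharCode(97 + value); // 0 -> "a", 14 -> "o"
    out = out.replaceAll(String(value), letter);
  }
  return out;
}

console.log(digitsToLetters("2477123892")); // "cehhmdijc"
```

Because letters never contain digits, a later pass can't accidentally touch something an earlier pass already replaced.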
Now every single number is equal to 4 bits of 0’s and 1’s. Every 8 bits can be converted to a single ASCII value… so in other words, if the number we wanted to save was, let’s say, 51, we can convert it to BCD: 0101 0001, then combine these two nibbles into one byte… It’d be equal to a decimal value of 81, & using an ASCII table, we can see that’s equal to Q. We can reverse this process to turn Q back into 51.
As you can tell, “Q” is a shorter code than “51.”
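As a quick check of the 51 and Q example, here's a small sketch using only built-in functions. The variable names are just for this illustration.

```javascript
// Two nibbles for the digits 5 and 1, joined into one byte.
const bits = "0101" + "0001";
const value = parseInt(bits, 2);        // 81
const ch = String.fromCharCode(value);  // "Q"

// Reverse: character -> 8 bits -> two nibbles -> digits.
const back = ch.charCodeAt(0).toString(2).padStart(8, "0");
const digits = parseInt(back.slice(0, 4), 2) + "" + parseInt(back.slice(4), 2);

console.log(value, ch, digits); // 81 "Q" "51"
```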
Let’s go back to that number example I gave earlier and nearly forgot about:
2477123892
Convert every digit to a letter to avoid complicating things
cehhmdijc
Then every letter to 4 bits (I added spaces to help show this)
0010 0100 0111 0111 1100 0011 1000 1001 0010
And group every 2 groups of 4 bits together
0010 0100
0111 0111
1100 0011
1000 1001
0010
For the last group, you’ll need 4 more bits to complete it… This is where you add 1111 to complete the last byte:
0010 0100
0111 0111
1100 0011
1000 1001
0010 1111
^ Every line is one byte, I only added spaces to show how I grouped things
Lastly, by converting this binary code to ASCII, you’ll get this:
$wÃ/
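^ Length went from 10 digits to 5 characters, reduced by 50%. It looks like only 4 characters because the fourth byte (10001001) has no visible glyph.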
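To see the bytes themselves rather than whatever the characters happen to render as, the grouping step can be sketched like this. `nibblesToBytes` is a hypothetical helper, separate from the code that follows.

```javascript
// Sketch: group nibbles into bytes, padding an odd count with 1111.
// `nibblesToBytes` is a hypothetical helper, not from the code in this post.
function nibblesToBytes(nibbles) {
  let bits = nibbles.join("");
  if (bits.length % 8 !== 0) bits += "1111"; // complete the last byte
  const bytes = [];
  for (let i = 0; i < bits.length; i += 8) {
    bytes.push(parseInt(bits.slice(i, i + 8), 2));
  }
  return bytes;
}

const bytes = nibblesToBytes([
  "0010", "0100", "0111", "0111", "1100", "0011", "1000", "1001", "0010",
]);
console.log(bytes);                            // [36, 119, 195, 137, 47]
console.log(bytes.map((b) => b.toString(16))); // ["24", "77", "c3", "89", "2f"]
```

Byte 137 (hex 89) is a control character, which is why the result above seems to be one character short.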
Code for everything I just explained so far
// converts string to 8bit binary code
window.stringToBinary=function(str) {
let binary = "";
for (let i = 0; i < str.length; i++) {
const charCode = str.charCodeAt(i).toString(2);
binary += "0".repeat(8 - charCode.length) + charCode;
}
return binary;
};
// converts 8bit binary code to string
window.binaryToString=function(binary) {
let str = "";
for (let i = 0; i < binary.length; i += 8) {
const byte = binary.substr(i, 8);
str += String.fromCharCode(parseInt(byte, 2));
}
return str;
};
// replaces all characters from arr1 with arr2 in string str
window.replaceA=function(str,arr1,arr2){
var a=str.toString(); // convert number to a string
arr1.forEach(function(value,index){
a=a.replaceAll(value,arr2[index]);
});
return a;
};
// converts number to code
window.numToCode = function(data){
var a=replaceA(
replaceA(data,['14','13','12','11','10','9','8','7','6','5','4','3','2','1','0'],
[ 'o', 'n', 'm', 'l', 'k','j','i','h','g','f','e','d','c','b','a']),
[ 'a' , 'b' , 'c' , 'd' , 'e' , 'f' , 'g' , 'h' , 'i' , 'j' , 'k' , 'l' , 'm' , 'n' , 'o' ],
['0000','0001','0010','0011','0100','0101','0110','0111','1000','1001','1010','1011','1100','1101','1110']
);
return binaryToString(a+((a.length)%8>0?'1111':''));
};
// converts code back to number
window.codeToNum = function(code){
var num=['0','1','2','3','4','5','6','7','8','9','10','11','12','13','14'];
var bi=['0000','0001','0010','0011','0100','0101','0110','0111','1000','1001','1010','1011','1100','1101','1110'];
var r=''; // declared with var so it doesn't leak as an implicit global
var n=stringToBinary(code);
for (let i = 0; i < n.length; i += 4) {
const substr = n.substr(i, 4);
if(substr!=='1111'){
r=r+num[bi.indexOf(substr)];
}
}
return r;
};
function copyToClipboard(str) {
navigator.clipboard.writeText(str)
.then(() => {
console.log(`Copied "${str}" to clipboard.`);
})
.catch((error) => {
console.error(`Error copying "${str}" to clipboard: ${error}`);
});
}
var c=prompt("Here's the number you just entered as a code",numToCode(prompt('enter number')));
window.location="#"+c;
copyToClipboard(window.location.toString());
Here’s another example, a longer number:
57889712000399347987857929304985920398902230894853417782390102498948936793929891247862783647283871982000109231223120091021893019829
In ASCII, you can expect some unrecognizable characters:
WÀ9GxWY 9#
4x#¢IHgG'dr
<#À ¢
Honestly, I’d be lying if I was to say that I have done enough testing with this because I haven’t.
A full byte can go up to 11111111, which is… let’s see… (2^8)-1 = 255, so counting 0 there are 256 possible values. On second thought, in our situation the high nibble never goes past 1110 and 1111 only shows up as padding in the low nibble, so the highest byte would be 11101111 = 255-(2^4) = 239. Counting 0, that’s 240 different characters!
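A brute-force count of the byte values this scheme can produce, assuming the high nibble is 0 to 14 and the low nibble is 0 to 15 (with 15 being the padding):

```javascript
// Enumerate every high/low nibble pair the encoder can emit.
const values = new Set();
for (let high = 0; high <= 14; high++) {
  for (let low = 0; low <= 15; low++) {
    values.add((high << 4) | low);
  }
}

console.log(values.size);         // 240
console.log(Math.max(...values)); // 239
```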
Click here then look at this page’s url, you should see the ASCII code added after the #
Now click here and see the difference
Notice here some characters have been replaced with a percentage sign followed by a number.
This is known as percent-encoding. The number after the percentage sign is the hexadecimal value for that character.
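Here's a rough sketch of what percent-encoding does to the earlier example code. It uses `encodeURIComponent` as a stand-in, which is an assumption on my part: the browser's own handling of the part after the # may escape fewer characters than this. For characters above 127, the hex digits are the character's UTF-8 bytes, so one character can turn into two %XX groups.

```javascript
// "$wÃ/" without the invisible byte, written with an escape for clarity.
const code = "$w\u00C3/";
const encoded = encodeURIComponent(code);

console.log(encoded);                              // "%24w%C3%83%2F"
console.log(decodeURIComponent(encoded) === code); // true
```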
If you plan to use the code in the link, then you should expect that to happen. Though if you plan to save the code into the local storage (which typically has a limit of around 5MB, depending on the browser) or inside a file, then I’d recommend this method.
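However, I should also note that the first 32 ASCII characters (0 to 31) are “control characters” that cannot be printed, and the character with octal value 177 (decimal 127, DEL) can’t be printed either. We’ll have to find every one of these characters and replace them with something else. (Characters 128 to 159 are control characters too and also show up as invisible; the replacement list below doesn’t cover them.)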
You might be wondering (or maybe not), how in JavaScript are we supposed to have our code find characters that we can’t even type?
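Well, you can add “\” followed by the octal value of any character to refer to it. (These octal escapes are a legacy feature and are rejected in strict mode and in template literals, where you’d write “\x1B” or “\u001B” instead.)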
For example, if you were to run this code:
alert("\33")
You’d see, in very small font, the letters: “ESC” (which stands for “escape”)
… don’t ask why I used that as an example
You \30 type any character with this!
Btw, the 33 from the example above is the octal value for ESC.
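If you'd rather not rely on octal escapes, here's a small sketch showing that the hex forms point at the same ESC character. It avoids writing an octal escape directly so it also runs in strict mode.

```javascript
// Octal 33 is decimal 27, which is hex 1B.
const esc = String.fromCharCode(parseInt("33", 8));

console.log(esc === "\x1B");    // true
console.log(esc === "\u001B");  // true
console.log(esc.charCodeAt(0)); // 27
```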
Below is a table of all unprintable characters that we want to replace along with their octal value:
Octal Value | ASCII Control Character |
---|---|
000 | NUL (null) |
001 | SOH (start of heading) |
002 | STX (start of text) |
003 | ETX (end of text) |
004 | EOT (end of transmission) |
005 | ENQ (enquiry) |
006 | ACK (acknowledge) |
007 | BEL (bell) |
010 | BS (backspace) |
011 | HT (horizontal tab) |
012 | LF (line feed) |
013 | VT (vertical tab) |
014 | FF (form feed) |
015 | CR (carriage return) |
016 | SO (shift out) |
017 | SI (shift in) |
020 | DLE (data link escape) |
021 | DC1 (device control 1) |
022 | DC2 (device control 2) |
023 | DC3 (device control 3) |
024 | DC4 (device control 4) |
025 | NAK (negative acknowledge) |
026 | SYN (synchronous idle) |
027 | ETB (end of transmission block) |
030 | CAN (cancel) |
031 | EM (end of medium) |
032 | SUB (substitute) |
033 | ESC (escape) |
034 | FS (file separator) |
035 | GS (group separator) |
036 | RS (record separator) |
037 | US (unit separator) |
177 | DEL (delete) |
Next step is finding a replacement for these characters… it’s possible that we’re using all 240 byte values depending on how long the data code is. So for this case, I’ll pick characters from another alphabet (Arabic)… here’s an array of the control characters and their replacements:
var cc=[
{cc: "\0" , r:"١"} , {cc: "\1" , r:"٥"},
{cc: "\2" , r:"٢"} , {cc: "\3" , r:"٦"},
{cc: "\4" , r:"٣"} , {cc: "\5" , r:"٧"},
{cc: "\6" , r:"٤"} , {cc: "\7" , r:"٨"},
{cc:"\10" , r:"٩"} , {cc:"\11" , r:"ث"},
{cc:"\12" , r:"ا"} , {cc:"\13" , r:"ج"},
{cc:"\14" , r:"ب"} , {cc:"\15" , r:"ح"},
{cc:"\16" , r:"ت"} , {cc:"\17" , r:"خ"},
{cc:"\20" , r:"ض"} , {cc:"\21" , r:"ع"},
{cc:"\22" , r:"ص"} , {cc:"\23" , r:"غ"},
{cc:"\24" , r:"ق"} , {cc:"\25" , r:"ه"},
{cc:"\26" , r:"ف"} , {cc:"\27" , r:"ن"},
{cc:"\30" , r:"و"} , {cc:"\31" , r:"ز"},
{cc:"\32" , r:"ى"} , {cc:"\33" , r:"ط"},
{cc:"\34" , r:"ر"} , {cc:"\35" , r:"ك"},
{cc:"\36" , r:"ء"} , {cc:"\37" , r:"ظ"},
{cc:"\177", r:"م"}
];
When converting from code back to a number, you’ll just need to replace the Arabic characters back with the unprintable characters before converting to binary.
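The swap in both directions can be sketched with just two entries from the list above. The function names `hideControls` and `restoreControls` are hypothetical, and the control characters are written as hex escapes here rather than octal.

```javascript
// Two entries only, taken from the replacement list; the rest work the same way.
const swaps = [
  { cc: "\x0C", r: "ب" }, // octal 14, form feed
  { cc: "\x19", r: "ز" }, // octal 31, end of medium
];

// Hypothetical helpers: control characters -> placeholders and back.
function hideControls(str) {
  return swaps.reduce((s, v) => s.replaceAll(v.cc, v.r), str);
}
function restoreControls(str) {
  return swaps.reduce((s, v) => s.replaceAll(v.r, v.cc), str);
}

const raw = "\x0C4Vx";
const shown = hideControls(raw);
console.log(shown);                          // "ب4Vx"
console.log(restoreControls(shown) === raw); // true
```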
And I believe that’s… everything
Now… for the final test…
Final Test
0123456789 = ب4Vx
^ 10 to 5 characters, reduced by 50%
16151413121101010977712799022 = فهíËا©w|y"
^ 29 to 11 characters, reduced by 62.069%
83741934103894823819 = tز4£H#
^ 20 to 10 characters, reduced by 50%
10101010679247120375511129715982 = ªªgGÀ7U¼ه/
^ 32 to 13 characters, reduced by 59.375%
1145789266479065547822677744637899923411 = ءW&dy٤UG&wtF7#K
^ 40 to 19 characters, reduced by 52.5%
012 = ب
^ 3 to 1 characters, reduced by 66.66%
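The percentages above come from comparing the number of digits with the number of characters in the code. A tiny sketch of that arithmetic, with `reduction` being a made-up helper name:

```javascript
// Percentage saved when `digits` digits become `chars` characters.
function reduction(digits, chars) {
  return ((1 - chars / digits) * 100).toFixed(3) + "%";
}

console.log(reduction(29, 11)); // "62.069%"
console.log(reduction(40, 19)); // "52.500%"
console.log(reduction(3, 1));   // "66.667%"
```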
Now… after a short headache, I hereby present to you the code for this:
stop();
// converts string to 8bit binary code
window.stringToBinary=function(str) {
let binary = "";
for (let i = 0; i < str.length; i++) {
const charCode = str.charCodeAt(i).toString(2);
binary += "0".repeat(8 - charCode.length) + charCode;
}
return binary;
};
// converts 8bit binary code to string
window.binaryToString=function(binary) {
let str = "";
for (let i = 0; i < binary.length; i += 8) {
const byte = binary.substr(i, 8);
str += String.fromCharCode(parseInt(byte, 2));
}
return str;
};
// replaces all characters from arr1 with arr2 in string str
window.replaceA=function(str,arr1,arr2){
var a=str.toString(); // convert number to a string
arr1.forEach(function(value,index){
a=a.replaceAll(value,arr2[index]);
});
return a;
};
// Dealing with nonprintable characters
var cc=[
{cc: "\0" , r:"١"} , {cc: "\1" , r:"٥"},
{cc: "\2" , r:"٢"} , {cc: "\3" , r:"٦"},
{cc: "\4" , r:"٣"} , {cc: "\5" , r:"٧"},
{cc: "\6" , r:"٤"} , {cc: "\7" , r:"٨"},
{cc:"\10" , r:"٩"} , {cc:"\11" , r:"ث"},
{cc:"\12" , r:"ا"} , {cc:"\13" , r:"ج"},
{cc:"\14" , r:"ب"} , {cc:"\15" , r:"ح"},
{cc:"\16" , r:"ت"} , {cc:"\17" , r:"خ"},
{cc:"\20" , r:"ض"} , {cc:"\21" , r:"ع"},
{cc:"\22" , r:"ص"} , {cc:"\23" , r:"غ"},
{cc:"\24" , r:"ق"} , {cc:"\25" , r:"ه"},
{cc:"\26" , r:"ف"} , {cc:"\27" , r:"ن"},
{cc:"\30" , r:"و"} , {cc:"\31" , r:"ز"},
{cc:"\32" , r:"ى"} , {cc:"\33" , r:"ط"},
{cc:"\34" , r:"ر"} , {cc:"\35" , r:"ك"},
{cc:"\36" , r:"ء"} , {cc:"\37" , r:"ظ"},
{cc:"\177", r:"م"}
];
// converts number to code
window.numToCode = function(data){
var a=replaceA(
replaceA(data,['14','13','12','11','10','9','8','7','6','5','4','3','2','1','0'],
[ 'o', 'n', 'm', 'l', 'k','j','i','h','g','f','e','d','c','b','a']),
[ 'a' , 'b' , 'c' , 'd' , 'e' , 'f' , 'g' , 'h' , 'i' , 'j' , 'k' , 'l' , 'm' , 'n' , 'o' ],
['0000','0001','0010','0011','0100','0101','0110','0111','1000','1001','1010','1011','1100','1101','1110']
);
// BCD to ASCII String
var bs=binaryToString(a+((a.length)%8>0?'1111':''));
// Replacing non-printable characters
cc.forEach(function(v){
bs=bs.replaceAll(v.cc,v.r);
});
return bs;
};
// converts code back to number
window.codeToNum = function(code){
var num=['0','1','2','3','4','5','6','7','8','9','10','11','12','13','14'];
var bi=['0000','0001','0010','0011','0100','0101','0110','0111','1000','1001','1010','1011','1100','1101','1110'];
var r=''; // declared with var so it doesn't leak as an implicit global
var c=code;
cc.forEach(function(v){
c=c.replaceAll(v.r,v.cc);
});
var n=stringToBinary(c);
for (let i = 0; i < n.length; i += 4) {
const substr = n.substr(i, 4);
if(substr!=='1111'){
r=r+num[bi.indexOf(substr)];
}
}
return r;
};
function copyToClipboard(str) {
navigator.clipboard.writeText(str)
.then(() => {
console.log(`Copied "${str}" to clipboard.`);
})
.catch((error) => {
console.error(`Error copying "${str}" to clipboard: ${error}`);
});
}
var test=prompt("Here's the number you just entered as a code",numToCode(prompt('enter number')));
window.location="#"+test;
copyToClipboard(window.location.toString());
alert(codeToNum(prompt("Write code to convert it to number",test)));
^ 105 lines total, first 100 are all the functions you need, last 5 are for testing purposes only.
^^ I feel like there’s a way to shorten my code as well, but I’m too lazy to try that
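One possible way to shorten it, as a rough sketch and not a drop-in replacement: use a regex to pick out the values and bit operations to build the bytes. The names `packDigits` and `unpackDigits` are hypothetical. This version skips the control-character replacement step, and it reads the number left to right, so for some inputs (such as 114) it produces a different code than the functions above, though it still decodes back to the same number.

```javascript
// Hypothetical compact variant; function names are illustrative only.
function packDigits(numStr) {
  // Left to right: "10".."14" become one nibble, any other digit one nibble.
  const nibbles = (String(numStr).match(/1[0-4]|\d/g) || []).map(Number);
  if (nibbles.length % 2 === 1) nibbles.push(15); // pad with 1111
  let code = "";
  for (let i = 0; i < nibbles.length; i += 2) {
    code += String.fromCharCode((nibbles[i] << 4) | nibbles[i + 1]);
  }
  return code;
}

function unpackDigits(code) {
  let digits = "";
  for (let i = 0; i < code.length; i++) {
    const byte = code.charCodeAt(i);
    for (const nibble of [byte >> 4, byte & 15]) {
      if (nibble !== 15) digits += nibble; // 1111 is padding only
    }
  }
  return digits;
}

console.log(packDigits("51"));                       // "Q"
console.log(unpackDigits(packDigits("2477123892"))); // "2477123892"
```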
numToCode(number) converts number to code
codeToNum(code) converts the code back to a number
stringToBinary(str) converts a string to binary code
binaryToString(binary) converts binary code to a string
replaceA(str, arr1, arr2) replaces elements from arr1 with those from arr2 inside string
copyToClipboard(str) copies text to the clipboard, works with unidentifiable characters
Last five lines are only for testing purposes, you won’t need ’em
I know that this works… but it feels incomplete… or maybe I did something wrong…? When working with binary and such I usually make mistakes. Was there an easier way of doing this…? In other words, I’m not really sure if this is complete, in that case, any thoughts or feedback are welcomed and I’ll turn this into a wiki to allow others to make edits.
I’ll have to go now and touch grass.
I don’t expect a lot of people to need this, only time you’d need it is when you have a save code that’s a number. Let me know your thoughts on this idea though