Yes, I did spend time on this - AOS for Lemmy.World - A generic Lemmy server for everyone to use.

500109

Yes, I did spend time on this

9mon 14d ago by feddit.org/u/cows_are_underrated in programmer_humor@programming.dev from feddit.org

Did you ever saw a char and thought: "Damn, 1 byte for a single char is pretty darn inefficient"? No? Well I did. So what I decided to do instead is to pack 5 chars, convert each char to a 2 digit integer and then concat those 5 2 digit ints together into one big unsigned int and boom, I saved 5 chars using only 4 instead of 5 bytes. The reason this works is, because one unsigned int is a ten digit long number and so I can save one char using 2 digits. In theory you could save 32 different chars using this technique (the first two digits of an unsigned int are 42 and if you dont want to account for a possible 0 in the beginning you end up with 32 chars). If you would decide to use all 10 digits you could save exactly 3 chars. Why should anyone do that? Idk. Is it way to much work to be useful? Yes. Was it funny? Yes.

Anyone whos interested in the code: Heres how I did it in C: https://pastebin.com/hDeHijX6

Yes I know, the code is probably bad, but I do not care. It was just a funny useless idea I had.

Did not knew that this existed, but yeah its kinda like that. Except that I only allow 5 characters.

25 bits.

Have you heard of Proquints?

https://www.ietf.org/archive/id/draft-rayner-proquint-03.html

This is hilarious. I'm not sure how often anyone would actually need to verbalize arbitrary binary data, but I do see an advantage over base64 since the English letter names are so often phonetically similar.

The FAA/ICAO use a similar system to name aerial navigation and fixed GPS waypoints. It addresses the challenge of communicating identifiers of unique nodes in a vast network using VHF communications.

Funny how they have a typo in test vectors:

0x0000 -> babab

0xFFFF -> zvzuz

0x1234 -> damuh

0xF00D -> zabat

0xBEEF -> ruroz

I did something like this once, in the course of a project whose purpose I don’t remember. Realising that 8-bit ASCII was wasted on the constrained alphabet of some kind of identifiers, I packed them into 6 bits, 1⅓ to a byte. I recall naming the code to do this “ShortASC”, pronounced “short-ass”

The AD&D "Gold Box" games from SSI Inc. stored game text in 6-bit encoding. The first one of these I played was "Champions of Krynn" and the PC release came on 4 360k 5.25 dsdd floppy disks. They actually needed the packing in those days, and couldn't afford to spent cpu cycles or ram on built in compression.
I remember opening up the game data files in a file viewer (maybe pc-tools?) and being confounded by the lack of text in the files.

You would have done well with this kind of thinking in the mid-80s when you needed to fit code and data into maybe 16k!

As long as you were happy to rewrite it in Z80 or 6502.

Another alternative is arithmetic encoding. For instance, if you only needed to store A-Z and space, you code those as 0-26, then multiply each char by 1, 27, 27^{2, 26}3 etc, the add them.

To unpack them, divide by 27 repeatedly, the remainder each time is each character. It's simply covering numbers to base-27.

It wouldn't make much difference from using 5 bits per char for a short run, though, but could be efficient for longer strings, or if encoding a smaller set of characters.

It sucks for me because my brain innately enjoys this kind of thinking

Play the Zachtronics programming games (Exapunks, Shenzen I/O, TIS-100) if you haven't yet.

After all.. Why not?

Why shouldn’t I ignore the 100+ cultures whose character set couldn’t fit into this encoding?

Not 100% relevant but it was in my collection and I thought it was close enough to be funny. :D

ŚĆŻRŹĘĄMŚ

So did ASCII

They left one bit for the other cultures use.

Yes, which is why we’ve broadly accepted that ASCII isn’t sufficient any more.

Åååååå!

Oh god that switch statement. Here, let me come up with something better:

if (pChar >= 'a' && pChar <= 'z') {
  return pChar - 'a' + 10;
} else if (pChar == ' ') {
  return 36;
} else if (pChar == '.'){
  return 37;
}
return 0;

Ah, thats cool. Did not knew you could that.. Thanks.

The original switch statement is great as it is. Easy to understand at a quick glance.

Yes, it's easy to understand, but this ifelse is much safer because it "handles" uppercase and digits by just returning 0. If you'd call OP's function with something like 'A', you'll get nonsense data because it doesn't have a default. (I think it will return whatever value currently resides in eax)

Can easily be dealt with by adding a default case. You also figured out this undefined behavior because you easily understood the switch statement.

Fair.

A single line comment would make it as easy to understand, and much more flexible if you wanted to add handling upper case letters or digits. Or even remap the values to a more standard 6 bit encoding.

Back in the day those tricks were common. Some PDP-11 OS's supported a "Radix-50" encoding (50 octal = 40 decimal) that packed 3 characters into a 16 bit word (40 codepoints=26 letters, 10 digits, and a few punctuation). So you could have a 6.3 filename in 3 words.

CPU still pulls a 32kb block from RAM...

Lol, using RAM like last century. We have enough L3 cache for a full linux desktop in cache. Git gud and don’t miss it (/s).

(As an aside, now I want to see a version of puppylinux running entirely in L3 cache)

I decided to take a look and my current CPU has the same L1 as my high school computer had total RAM. And the L3 is the same as the total for the machine I built in college. It should be possible to run a great desktop environment entirely in L3.

Look at this guy with their fancy RAM caches.

Cache man, its a fun thing. ~~32k~~ 32 (derp, 32 not 32k) is a common cache line size. Some compilers realise that your data might be hit often and aligns it to a cache line start to make its access fast and easy. So yes, it might allocate more memory than it should need, but then its to align the data to something like a cache line.
There is also a hardware reasons that might also be the case. I know the wii's main processor communicates with the co processor over memory locations that should be 32k aligned because of access speed, not only because of cache. Sometimes, more is less :')

Hell, might even be a cause of instruction speed that loading and handling 32k of data might be faster than a single byte :').

Then there is also the minimum heap allocation size that might factor in. Though a 32k minimum memory block seems... Excessive xD

Cache Man, I would watch that movie.

Cache lines are 64 bytes though? Pages are 4k.

Ye derp, im used to 32, not 32k lol.

Interesting idea but type conversion and parsing is much more slower than wasting 1 byte. Nowadays memory is "free" and the main issue is the execution speed.

Fuck it. *uses ulong to store a boolean*

So, python?

I know. This whole thing was never meant to be very useful, and more like a proof of concept

Alignment wastes much more anyways

I'm not sure if this is the right setting for technical discussion, but as a relative elder of computing I'd like to answer the question in the image earnestly. There's a few factors squeezing the practicality out of this for almost all applications: processor architectures (like all of them these days) make operating on packed characters take more operations than 8 bit characters so there's a speed tradeoff (especially considering cache and pipelining). Computers these days are built to handle extremely memory demanding video and 3d workloads and memory usage of text data is basically a blip in comparison. When it comes to actual storage and not in-memory representation, compression algorithms typically perform better than just packing each character into fewer bits. You'd need to be in a pretty specific niche for this technique to come in handy again, for better or for worse

This is 100% true. I never plan on actually using this. It might be useful for working on microcontrollers like an ESP32, but apart from that the trade of for more computational power is not worth the memory savings.

Having seen many of Kaze's videos on N64 development, I've learned that the N64 has like 4x the processing power it needs compared to its memory. In hardware cases like that the trade-off of computational power and memory memory savings gets you some nice performance gains.

I liked the technical discussion so thank you. Keep it up, I got into this career because there was always so much to learn.

There is still fun to be had! Just... Different fun!

In database land lookup tables are pretty common. Prefix tries and the like are super common in search land. I've seen GCD, offset, delta-of-delta, and some funky bitwise floating point compression used. Sometimes just to save dist space. But usually to save working set space or IO or S3 cache space.

And squeezing the most out of modern CPUs is its own art. Compilers are glorious. And modern CPUs are magic lightning rocks. But you can learn to sing to them just right to make them all happy.

Oh god, please don't. Just use utf8mb4 like a normal human being, and let the encoding issues finally die out (when microsoft kills code pages). If space is of consideration, just use compression, like gz or something.

You could save 0.64 bit per char more if you actually treated you output as a binary number (using 6 bits per char) and didn't go through the intermediary string (implicitly using base 100 at 6.64 bits per char).
This would also make your life easier by allowing bit manipulation to slice/move parts and reducing work for the processor because base 100 means integer divisions, and base 64 means bit shifts. If you want to go down the road of a "complicated" base use base 38 and get similar drawbacks as now, except only 5.25 bits per char.

I was so triggered by the conversion from char-to-int-to-string-to-packedint that I had to write a bitwise version that just does char-to-packedint (and back again), with bitwise operators.

https://pastebin.com/V2An9Xva

As others have pointed out, there are probably better options for doing this today in most real-life situations, but it might make sense on old low-spec systems if not for all the intermediate conversion steps, which is why I wrote this.

Not useless -- you have a future in tiny, embedded systems.

I do this kind of thing everyday as a firmware engineer :)

What do u write the firmware for?

It’s all fun and games until the requirement changes and you need to support uppercase letters and digits as well.

I am constantly on how I can allow Uppercase, without significantly reducing the posiible amounts of chars

Well it’s certainly possible to fit both uppercase and lowercase + 11 additional characters inside an int (26 + 26 + 11 = 63). The you need a null terminating char, which adds up to 64, which is 6 bits.

So all you need is 6 bits per char. 6 * 5 = 30, which is less than 32.

It’s easier to do this by thinking in binary rather than decimals. Look into bit shifting and other bitwise operations.

Depending on the use-case you might also want to add special case value like @Redkey@programming.dev did in their example, and get kind of UTF-8 pages. Then you can pack lowercase to 5 bits, and uppercase and some special symbols to 10 bits, and it will be smaller if uppercase are rare

If you're ever doing optimizations like this, always think in binary. You're causing yourself more trouble by thinking in decimal. With n bits you can represent 2^n different results. Using this you can figure out how many bits you need to store however many different potential characters. 26 letters can be stored in 5 bits, with 6 extra possibilities. 52 letters can be stored in 6 bits, with 12 extra possibilities. Assuming you want an end of string character, you have 11 more options.

If you want optimal packing, you could pack this into 48 bits, or 6 bytes/chars, for 8 characters.

Does the efficiency of storage actually matter? Are you working on a constrained system like a microcontroller? Because if you’re working on regular software, supporting Unicode is waaaaaaaaaaay more valuable than 20% smaller text storage.

Unicode? Sir this is C, if the character doesn't fit into a uint8 it's scope creep and too hard

I do sometimes work with microcontrollers, but so far I have not encountered a condition where these minimal savings could ever be useful.

Not that long ago I was working on TI's C2000 DSPs. They have a 16 bit word size, which would make this a convenient way to pack three characters into a single word. Ther ewould then be a single leftover bit which could be used for signaling, parity, or something else.

unsigned int turn_char_to_int(char pChar)
{
    switch(pChar)
    {
    case 'a':
        return 10;
    case 'b':
        return 11;
    case 'c':
        return 12;
    case 'd':
        return 13;
    case 'e':
        return 14;
    case 'f':
        return 15;
    case 'g':
        return 16;
    case 'h':
        return 17;
    case 'i':
        return 18;
    case 'j':
        return 19;
    case 'k':
        return 20;
    case 'l':
        return 21;
    case 'm':
        return 22;
    case 'n':
        return 23;
    case 'o':
        return 24;
    case 'p':
        return 25;
    case 'q':
        return 26;
    case 'r':
        return 27;
    case 's':
        return 28;
    case 't':
        return 29;
    case 'u':
        return 30;
    case 'v':
        return 31;
    case 'w':
        return 32;
    case 'x':
        return 33;
    case 'y':
        return 34;
    case 'z':
        return 35;
    case ' ':
        return 36;
    case '.':
        return 37;

    }
}

Are you a monster or just stupid?

Just stupid

If you couldn't write

if(pChar >= 'a' && pChar <= 'z') return pChar - ('a' - 10);

I suppose you typed this "all the size of a lookup table with none of the speed" abomination manually too.

switch case structures are very efficient in c and c++. They work similarly like an offset into memory. Compute the offset once (any expression in the 'case' lines), then jump. Using primitives directly, like here with chars, is directly the offset. Contrary to if-else branches, where each case must be evaluated first and the CPU has basically no-op cycles in the pipeline until the result of the branch is known. If it fails, it proceeds with the next one, waits again etc.. (Some CPU architectures might have stuff like speculative branch execution, which can speed this up.)

However, code-style wise this is really not elegant and something like your proposal or similar would be much better.

Oh, I didn't know that they were a LUT of jump addresses. Stil, a LUT of values would be more space-efficient and likely faster. Also, what if the values are big and sparse, e.g.

switch (banknoteValue) {
    case 5000:
        check_uv();
        check_holograph();
    case 2000:
        check_stripe();
    case 1000:
        check_watermark();
}

...does the compiler make it into an if-else-like machine code instead?

Good question! I have read a bit more about it and this does indeed heavily depend on the respective compiler implementation. Depending on the compiler, it may prefer default if-else ladders for few cases. For more, dense cases, LUTs may play a larger role. For less dense cases binary search might be chosen.

I inspected the generated assembler code of your (slightly extended) example via https://godbolt.org/

The code I used:

void check_uv();
void check_holograph();
void check_stripe();
void check_watermark();

void switch_test(int banknoteValue) {

    switch (banknoteValue) {
        case 5000:
            check_uv();
            check_holograph();
        case 2000:
            check_stripe();
        case 1000:
            check_watermark();
    }

}

Using x86-64 gcc 15.2 this leads to a couple of cmp followed by je instructions, so "compare" and "jump to label if equal" which basically is a typical if-else ladder. I get the same for x64 msvc v19.43.
Changing the cases to 1, 2 and 3 instead of 5000, 2000 and 1000 does not change the outcome.

Increasing to 23 different but incrementally increasing cases (cases 1 to 23) does not change the outcome as well for gcc. But here msvc has introduced a performance optimization: it decreased the input value by one to get a range of 0 to 22 and then created a jump table, so a LUT with execution addresses. (I am not going to detail the assembler commands logic here, but you can use the C++ code below and take a look yourself. :) )

So even in this simple example we can already see how different compilers may implement switch cases differently depending on its structure. Even though gcc chose the apparently less efficient solution here, usually one may trust on the compiler choosing the most efficient switch implementation. ;)

As far as I know, we would not even get the chance of similar optimizations if choosing if-else ladders directly instead of a switch-case structure. It would be interesting to put this to a test though and see whether some compilers translate if-else ladders equivalently with the performance benefits that can currently come with switch structures.

The inflated code:

void check_uv();
void check_holograph();
void check_stripe();
void check_watermark();

void switch_test(int banknoteValue) {

    switch (banknoteValue) {
        case 1:
            check_uv();
            check_holograph();
        case 2:
            check_stripe();
        case 3:
            check_watermark();
        case 4:
            check_watermark();
        case 5:
            check_watermark();
        case 6:
            check_watermark();
        case 7:
            check_watermark();
        case 8:
            check_watermark();
        case 9:
            check_watermark();
        case 10:
            check_watermark();
        case 11:
            check_watermark();
        case 12:
            check_watermark();
        case 13:
            check_watermark();
        case 14:
            check_watermark();
        case 15:
            check_watermark();
        case 16:
            check_watermark();
        case 17:
            check_watermark();
        case 18:
            check_watermark();
        case 19:
            check_watermark();
        case 20:
            check_watermark();
        case 21:
            check_watermark();
        case 22:
            check_watermark();
        case 23:
            check_watermark();
    }

}

Yes, I did type it out manually (not really, I just copy pasted it and changed the according values)

We have a binary file that has to maintain compatibility with a 16 bit Power Basic app that hasn't been recompiled since '99 or '00. We have storage for 8 character strings in two ints , and 12 character string in two ints and two shorts.

Damn, that are setups where you have to get creative.

I was hopping it was somehow badly implemented in python and each char ended up occupying 2Gb

Hmmmmmmm, that sounds like another fun idea. Trying to make storing a single char as inefficient as possible.

Could you spawn a container to run the function that packs the char?

You’re gonna need an entire ansible infrastructure to stand up the host that runs the podman container too.

Don't forget to make a C interface for that Python library, so that it can be used in Kernel code.

Its already written in C1

dammit yesterday was too long i thought this was a dnd joke at first

Me too! Haven't had my coffee yet. I was like

"... character...? Charisma...? (blink blink)"

coffee? Fuck that's what's going on i knew it. hold on

That's where std::vector<bool> or bitfields come in handy!

An enum and a logical "or" operator.

In typical C fashion, there's undefined behavior in turn_char_to_int. xD

did not want to always scroll past that behemoth of a switch case xD

Yes I know, the code is probably bad, but I do not care

That's why we love it.

My colleague said he didn’t see the point in storing enums as shorts or bytes instead of a full word, so I retaliated by storing them in their string form instead, arguing that it’ll be compressed by the db engine.

That's bootleg gzip.

Mostly because compilers do this kind of stuff if you optimize for space, iirc. Not that you should never do it or something, but it kinda looks like premature optimization to me.

No, it doesn't. It can't know if you need a full set of characters or only a subset of them, so it can't optimize like this. If you know you only need to represent capital letters and a few punctuations, you can do something like the OP. The compiler has to assume you could need the full range of characters represented by the format though (especially since it doesn't even know if you'll continue to use them as characters—you may want to do arithmetic on them).

Definitely premature optimization, and not even ose to optimal either. It's next to useless, but I think the OP was just having fun.

Idk, I just had this funny idea, and thought I could do this as a cool and quick proof of example

I mean… you’d get better results for large data sets by just using a known compression algorithm. This is only viable for situations where you only have a small amount of data, enough computation to run this conditional, but not enough computation to run compression/decompression.

And here I was wasting time with bit fields to make my bools smaller.

At first I thought, "How are they going to compress 256 values, i.e. 1 Byte sized data, by "rearranging into integers"?

Then I saw your code and realized you are discarding 228 of them, effectively reducing the available symbol set by about 89%.

Speaking of efficiency: Since chars are essentially unsigned integers of size 1 byte and 'a' to 'z' are values 97 to 122 (decimal, both including) you can greatly simplify your turn_char_to_int method by just substracting 87 from each symbol to get them into your desired value range instead of using this cumbersome switch-case structure. Space (32) and dot (46) would still need special handling though to fit your desired range.

Bit-encoding your chosen 28 values directly would require 5 bit.

I'm a simple man. Here's my list of variable names:

Var1

Var2

Var3

Var4

Var5

...

Use strings for everything and use a single universal method to convert some to floats only when you absolutely have to.

Hey, this is awesome for saving space when writing things to NFC tags! Every bit still matters with those suckers

I have a coworker who does stuff like this and it's always low-benefit optimizations that cost the team time to interface with - but I do still kind of love it

I feel like many programmers (or their management) have grown ignorant to resource limitations over the past decade or so.

Obviously there is good examples like many linux distros running well on 4GB RAM and the like, but when it comes to windows, websites and proprietary programs, they gobble up insane amounts of RAM to provide almost the same functionality as in 2010.

It's just not on their radar at all these days. You want to develop and iterate quickly, so you're not going to program everything from scratch. No, you grab an off-the-shelf framework and implement only your business-specific things in that framework. There's so many layers of abstraction that optimization becomes impossible (beyond what the framework does for you), but it saves you a ton of expensive developer hours and gets you to market really fast. And when someone complains that your website doesn't perform for shit, you just blame their hardware, right? Externalize those costs.

they gobble up insane amounts of RAM to provide almost the same functionality as in 2010.

Critical to using our service? Maybe even an operating system?

ELECTRON APP!

4GB to run well... I remember happily running linux on 4MB of RAM, complete with X and web browser. I also remember running BeOS on a machine with 64MB of RAM and having one of the the best desktop experiences I've ever used.

I agree 100%! Butt I'm joking about a façade of optimization. Making code confusing and hard to interface with by making up custom data types. And for more context, their main project is a UI that takes >10 seconds to load and uses 2+GB of RAM. But at least the UUIDs in the SQLite DB are stored as hex instead of strings 😅 (even though I think everything in SQLite is actually stored as a string under the hood?)

I do still admire the desire for optimization - but it might be some sort of coping mechanism to ignore the insanely unoptimized bits of the project

C lets you do this by putting text in single quotes:

int foo = 'Abcd';

works

But that's only four chars in four bytes. This absolute madlad has put five chars in four bytes.

in the same vein (storing more data in less bits) you should check out tagged pointers as well!

I don't think that's a useless implementation at all. code looks relatively clean, and it definitely has its uses in the embedded systems world.

It is neither useless nor funny. It's optimization for storage capacity. If everyone in the world put in that level of effort into compression, computer storage and processing would actually be faster than the previous generation.

What was that? I was busy installing a package in my node.js cluster to convert the first letter in a sentence to uppercase.

I feel attacked

Processing would be quicker?

This would add significant processing overhead because of conversions everwhere. There is a lot of string processing going on inside your CPU already, I'm reasonably sure this would add a measurable overhead.

Plus I believe this would cause even more string related security vulnerabilities to emerge because of additional code that needs to be maintained.

Unless you only copy and compare you have to decode it, or implement more complicated logic for everything from searching to concatenation (which is normally just memcopy).

This will actually run much slower. It saves space, but it has to do a bunch of math every time it needs to store or load a string I stead of just outputting it. Maybe if you had really limited cache space this could run faster (since it could save on fetching from RAM/storage), but, unless you're storing some really long strings, it won't make a difference.

now use the intchar to store prefixes for a smart string , or pack them to make a simd optimized b tree with 40 string prefixes per node instead of 32