Suddenly, during a quick demo, I realized something was off.
Oh boy, the game started…
After an initial smoke test, I realized the system wasn't exporting files as expected. I exported two files in different formats: one worked, but the other kept loading forever.
Our notification Slack channel started queuing messages, letting us know we’d be here until late night on a Friday.
After my first investigation, without panicking, I realized it wasn't all files - it was only PDF files. I exported a Word document from our internal system, and it worked. I exported a CSV file, and it worked as well. I started a job to generate a PDF again, but it kept loading forever.
I went to our staging environment and exported one PDF document - it worked.
My head was spinning. Given that I had an ongoing demo, I had to step away from the problem, so I asked Martin (a skilled engineer who has been part of Datia almost from the beginning) to investigate the issue with me.
Up to this point, it was 3:33 PM.
After a few minutes, we started investigating across our instances.
"Might it be Nginx?" We checked our servers, and the access logs showed errors mentioning an upstream timeout.
Weird (x1).
We kept investigating until we reached the usual suspect (and something that had caused headaches before) - AWS WAF. WAF is a web application firewall provided by AWS that essentially helps us block, control, and prevent harmful requests from the open web on our public domains.
After a few minutes of digging with it and a few unsuccessful trials, we decided to step further and investigate our buckets. Recently, we had run into a few issues where AWS threw us an error regarding hitting quotas for S3 buckets.
Weird (x2), since we thought the cloud was infinitely scalable. When in reality, it's just someone else's machine.
We checked the logs and found our first hint: an unauthorized error from AWS after an apparent bottleneck. We looked at each other’s faces in the meeting and said in unison:
“It must be S3.”
Narrator: it wasn’t
We ran to try it out on a new bucket and wrote a few scripts to test it out. We ran back to staging since it was the environment that used to work before but now suddenly wasn’t working anymore - it kept loading forever.
Our assumption was, “Okay, we might be hitting another obscure quota from S3.”
Let’s create a bucket from scratch, but first, let’s try it out against these buckets from a virtual machine.
We did it, and it worked like a charm - same document, even. Now our confusion was extreme, but we had to keep digging like Sherlock Holmes digs into crimes.
We ran into the questions that one essentially does not encounter in a serious programming environment (no pun intended, JavaScript boys!): an undefined stream? A null one?

Up to this point, it was 6:40 PM for me (in Sweden) and late for Martin in Argentina as well. We started running in circles.
We decided to zoom in a bit and start testing directly in the production environment, since it was outside business hours (and because of something I didn't mention: everything worked like a charm on our local machines!)
We modified the code and switched to a non-library strategy for uploading the stream to S3 (which, by the way, is messy and not that straightforward).
A few tries, and still nothing - it worked on our local environments but didn’t work on the cloud.
Up to this point, I was getting frustrated since it was Friday, and we were still here, debugging a JavaScript mess. Frustrating enough since it feels like in JavaScript, you never have control over what’s going on.
In C, you can go ahead and misconfigure a makefile to output debug files, a segfault, or a bad reference, but at the end of the day, it’s your fault.
The same goes for any serious language ecosystem out there.
But JavaScript is quite the opposite: it works like magic (and not as a compliment!) even though you know it’s pure software!
I ended up believing, “Maybe AWS blocked our public domain, and from there, we could not upload files? What if…”
Anyway, I tried on a virtual machine we have, but nothing - it worked like a charm. The funny thing about this is that during this process, we had three pipelines and workflows for the weekend running in the background, and none of those broke or raised an error.
Arriving at 8:15 PM, we decided to wrap up for the day and continue individually. None of the steps we had taken had worked.
So I decided to start removing and cleaning the things we did for testing, and I took a five-minute walk to the supermarket.
I came back and said, “Okay, let’s start again.” I ran my local against our production environment and emulated the scenario - it worked, even with our domain.
It wasn’t an issue with S3; AWS was discarded. It must be the documents service or the code…
We discussed a bit more with Martin, chatted, and then I continued with the experiment…
I deleted every cache in my local environment, every node_modules directory, every lock file. I ran the installation process once again and exported again… and now it didn't work locally either.
The next message I sent: “We found it!”
Now it was a matter of understanding what was going on since it wasn’t a server issue but a library issue.
Doing a `git diff`, we could observe a few changes in libraries related to:
@smithy/util-stream@2.2.0
@smithy/is-array-buffer@2.1.1
...
All of these because `@aws-sdk` got updated a few months ago… Weird (x3), and extremely closely related.
We kept investigating and found that the actual library we used for rendering files inside the server had changed as well, and what it changed was… the streaming pipeline.
Oh boy, we were close.
We kept digging and decided to downgrade the packages to the latest versions published before these changes. We modified the `package.json` and pushed to production (yeah, at this point, there was no other way around it).
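For anyone in the same spot: the downgrade amounts to replacing version ranges with exact versions in `package.json`, so a fresh install cannot drift forward again. A sketch with made-up package names and versions (not our actual dependency list):

```json
{
  "dependencies": {
    "@aws-sdk/client-s3": "3.300.0",
    "some-pdf-renderer": "1.4.2"
  }
}
```

Note the absence of `^` or `~`: a range like `"^3.300.0"` would still allow any later 3.x release, which is exactly how an upgrade like this slips in unnoticed.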
…
After a few minutes, the deploy failed: GitHub timeout.
Oh my. Anyway, we started a new workflow, and it worked.
We went to our production environment and checked - it didn’t work, but now it was no longer a timeout but a direct error on a function in charge of handling the file streaming.
We were so close.
Checked the logs:
`this.subset.encodeStream is not a function`
Oh my, holy grail… What now?
I did a quick search, and apparently, if you want to use a specific version of the rendering library, you also need to pin a matching version of the library it uses for registering fonts in the document…
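If the mismatch lives in a transitive dependency (the rendering library pulling in an incompatible font-handling library, as the `encodeStream` error suggests), pinning your direct dependencies isn't always enough. npm (v8.3+) supports an `overrides` field, and Yarn a `resolutions` field, to force a single version of a nested package. A sketch with a placeholder package name and version:

```json
{
  "overrides": {
    "some-font-library": "2.0.1"
  }
}
```

Use this sparingly: an override silences the mismatch rather than resolving it, so it's a stopgap until the direct dependency ships a compatible release.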
Anyway, it was around 10 PM, and the only thing we wanted was to close our computers and sleep (for me, I don’t know about Martin).
We upgraded and reinstalled the library, pushed, and ran the installation process again - it didn't work.
Well…
After a few minutes, we cleaned up the pipeline cache, restarted our environments, and even cleaned up the load balancer (WTF, just in case).
I pushed again and deployed. It worked.
Up to this point, we weren’t even surprised; we just wanted to leave.
We checked the document - it was empty except for the SVGs inside.
I’m thinking about opening a restaurant now.
We checked again and did the entire process from scratch once more; in addition to the other downgrade/upgrade actions we had taken, we had to upgrade a library we didn't think was related to this mess - and now it worked. The document was there, and everyone could have a PDF.
The system was stable again.
Our sanity might not be.
It’s almost incredible to believe that this ecosystem is so broad and spread out across the globe. It’s awful, and the developer experience is close to being a paranoid android trying not to shoot himself in the foot with a bazooka.
JavaScript sometimes feels too malleable, too easy, too clean, but it's not. The payback comes later in the game.
Now our strategy will surely be to use at least a typed system or a more robust runtime. This mess cannot be there any longer, and if it is, JavaScript won’t be the tool to support it (since it’s strictly a bad tool).
It feels like you need to know too many nitty gritty details (and the devil lies in those details!) to truly understand what’s going on - and it’s not about being an expert, but an expert on the minute particulars.
Why didn’t this library work in conjunction with the latest version of AWS? Who knows? Not even the developers of the library might know, because the JavaScript ecosystem is so badly interconnected that everyone relies on blindly trusting each other’s code and infrastructure. It’s not even close to the experience of using robust package managers like Cargo (Rust), NuGet (C#), or others.
JS works, and it works well for a lot of things. But don’t try to build a truly scalable system with it. It’s simply too difficult to avoid shooting yourself in the foot entirely when the ecosystem is this fragile and delicate. One tiny version mismatch or breaking change can bring your mission-critical application to a standstill, as we painfully experienced.
The JavaScript world moves too fast, with dependencies changing constantly and breaking backwards compatibility. This makes it extremely high-risk for any project aiming to be an enterprise-grade, long-lasting solution. The deck is stacked against you when you have to wrestle with the ever-changing currents of the npm universe and its nitty gritty issues.
It feels like you’re constantly swimming against the current, needing to be a domain expert on the most granular details just to keep your application afloat. It’s an unnecessarily high cognitive burden that often outweighs the benefits JavaScript provides.
Don't get me wrong, JavaScript is a good tool sometimes. But we've learned the hard way that it simply isn't the right tool for building robust platforms that need to stand the test of time.
The ecosystem is a beautiful mess, but a mess nonetheless. For mission-critical applications, the costs of dealing with its chaos and nitty gritty issues are too high.
At this point, it's not even about JavaScript itself, but about the fundamental lack of trust a developer can have in their tools. Imagine a carpenter with a hammer that drives nails most of the time, but then randomly and unpredictably strikes the carpenter instead.
Why would that carpenter want to use such an unreliable and hazardous hammer for every piece of furniture they build? It makes no sense. And yet, that’s the situation we find ourselves in with JavaScript.
I don't know if upcoming projects like Bun.js or Deno will do anything to rectify the situation. I simply believe the JavaScript language itself has become too amorphous and monstrous not to consistently breed confusion across builds and deployments.
Sweden was exactly what I expected before coming here: winter, darkness, and complete silence while I spent time building tech for Datia.
I will very much miss the walks in Vinterviken and being surrounded by such amazing nature.
The funny thing about all of this is that the weather is not why I'm writing this; that would be falling into a cliché. In the end, a million people live in this city - it cannot be that bad. Maybe it's a good excuse, and perhaps it works as an afterthought and in chit-chat conversations (since Swedes are experts at talking about the weather), but on deeper reflection, the weather is almost used as a joke.
In my view, there's one main personal reason (apart from my partner living in another country and us wanting to be together) why I decided to move away:
No third places.
I grew up in the countryside, near Rosario, a really nice and calm city where everyone was wandering around, talking, greeting, and hanging out in places. Those places I found later on in the city and in every place where I lived. I’ve always found places to hang out. Except here.
In Turin, it was Parco del Valentino, Porta Palazzo, the many street cafés where you can stop, drink, and go, the trattorias, and at least ten other different kinds of places.
From mountains to forests, clubs, frozen lakes, etc.; I knew something was happening everywhere. Energy. People laughing. Just hanging out with strangers in the same physical space (even if the time was not being directly shared).
Living abroad makes you feel quite disconnected from everything, and it feels strange to be an outsider. However, you want to feel comfortable enough to feel close to home.
That’s why, abroad, you end up listening to music from the country you’re from, eating the things you didn’t like before or, in my case, drinking mate (much more than at home).
Escaping this reality and living outside your comfort zone is nice for a while, and then suddenly you start to feel trapped in an infinite loop of dissatisfaction.
Where you were is not the place you want, and now the place you want is no longer the place you wanted to live before.
It's a dichotomy, and it's because traveling changes you. However, don't travel in a touristy way. Travel in a way that lets you be immersed in the culture of the place you are in. The feeling of being part of the routine and not just an NPC.
Now, Stockholm: you might as well be replaced by a flower vase in the subway, and no one would ever notice.
A lovely city, with awesome transportation that can take you everywhere you want, a city almost taken from a dream and put onto earth with lovely people and well-designed interiors, with peculiar architecture and astonishing cleanliness all over the place.
You might have events, clubs, bars, or even activities such as bouldering (with an amazing bouldering gym), but everyone is going somewhere, and you are not invited. As for clubs - and I know for sure there are many (sports clubs, board game clubs, etc.) - they might require you to have a ticket or a schedule.
Disclaimer: there may be third places and venues that I am not aware of. If so, feel free to let me know so I can visit them next time!
Learning how to be by yourself makes you better, although during the winter, practically everything dies out (except for a few parties), and even if such activities exist, your soul is being crushed by the furious winter.
I was fortunate to have a few friends in the city while we were building Datia, with an office full of amazing coworkers. However, the sensation on Friday evenings was awful.
My sensation after spending almost eight months here is that everyone is invited to a party except you.
There’s no synchronicity, no spontaneous meet-ups, no random strangers you might encounter.
Before any Swedes get mad, don’t get me wrong – it’s a perfect life you are living here. You don’t have chaos. You don’t encounter aleatory events in the street that might delay you from doing your thing, and even the lunch breaks during weekdays are optimized with those peculiar buffets you have.
It's also a lovely country, full of crazy good nature and crowded with awesome people. However, I learned it's not for me.
Nowadays, I feel accomplished about the decision in the end. I managed to live here for a while and understand a little bit more about myself.
Moreover, I started contributing to Wing, I built my own virtual machine in Rust, and I even wrote and coded an essay after years of not writing anything that technical.
Places are just places, and we are the things that happen inside them. That's why I loved being here: I understood how to love your own place. Your home.
Spain, you are next.
Nothing beside remains. Round the decay
Of that colossal wreck, boundless and bare
The lone and level sands stretch far away.
These days, I have been reading a book that is affecting my day-to-day; this book is about finite games and infinite games, but putting a name to it is pretty much meaningless.
The book is much about how we can interact as a society to boost everyone’s potential.
Idealistic? Yes, although as I read it, I laughed through the texts during my commute to work. Not humorous laughter, but sincere, uncomfortable laughter. Like when someone tells you an uncomfortable truth and you know, in the end, it's true - that's the kind of book it is.
Reading this nowadays makes me realize how little society has changed during the past 40 years in terms of self-understanding.
Everything is about titles and competitions, now even more than before, and finite games are so universal that the leverage you can gain from being an infinite player is worth close to zero.
There's no point in not competing on a day-to-day basis, since the system will kick you out (sooner or later).
Even researchers are leaving for big tech. However, this is not new, and I won’t go into this topic since it’s heated.
There's no other point in history where knowing things could give you exactly as much as not knowing them - and I'm not talking about money, but actual knowledge.
Not for lack of information; on the contrary, people tend to choose not to learn because of information exhaustion. Everyone is already so focused on their own personal finite game, one with the same starting point as everyone else's.
Lack of time is killing people's ability to be creative, and those who want to be creative are doomed to suffer the pains of being outsiders.
A few years ago, almost like 12 years ago now, on the Internet, people were amused when someone shared the secret sauce on how to build a complex piece of software. Not because it wasn’t available in the first place, but because it felt completely robotic or uncool since no one was “writing an easy way” to do it (GNU was open, open source was flourishing, etc.). It was not common to find good content without getting you into the roots of algebra, calculus, rooting for you to learn first how to read assembly before going into C; if you weren’t part of academia, or any group of knowledge, you were pretty much in the shadows trying to feel the walls around you.
It wasn’t impossible. It just was hard enough.
Nowadays, impressively enough, it takes you two searches to start understanding how a nuclear reactor works…
I remember back in the days when I wanted to understand more about compilers, everyone recommended The Dragon Book. Yeah, sure (myself of 15 years old disagreed with that). It seemed like everyone wanted to be a gatekeeper of knowledge back then. If you were there, you needed to know how to ask the question, how to be the person they expected you to be. A secret handshake of sorts to gain entry into the halls of knowledge.
However, such gatekeeping is human. In antiquity, the Babylonian scribes (Eduba) guarded scholarly knowledge as a means of maintaining power and prestige.
It's this gatekeeping that ended up being commodified into a business model - something that allows organizations like JSTOR to stay alive to this day.
Thanks to Aaron Swartz and many more fighters who chose to play the infinite game, we can now rely on other mediums of information that do not charge 50 USD per article to access human knowledge.
Information was almost sacred for non-academic people, or for people like me in the south of the world, during the early (or not so early) years of the modern Internet.
Maybe during the witch hunts, or even during Caesar's attack on the Library of Alexandria, people watched those attacks with the same eyes we use on our own society right now: knowing it's bad, trusting it's bad, and letting all the bad happen because, of course: how much can you do?
Now, let’s imagine for a second: What would happen if another Alexandria occurs now?
Maybe Babylon had a real tower, but we will never know, and even if we know, nobody will ever know the difference.
Society, if it keeps learning no matter what, will end up being so self-aware that essentially nothing will be worth learning anymore.
Now we have perpetrators of information called Models, that are just a reflection of a society that’s focused on the outcomes, not the process and not even quality. Just spit the code. Spit the text I want.
Trust that I’ll trust you, machine.
Who will be writing now in the future? Who will be playing for the machines? Will people label stuff if they are aware that they are feeding the machine that will eat them later?
We talk about games all the time, although the rules are already set, and we cannot change those.
I'm personally all the way for technology; however, we need a set of guardrails, not for the technology itself, but for us to understand where the boundaries are, and where humans need to use these tools to expand their capacity and not the other way around.
Regulations won’t work.
Limiting these tools won't work either (it's like stopping a computer from adding two variables).
Opposition to it will end up in hunger, war, and broken societies.
We need a new way to understand how states can be built from scratch using knowledge as a source of truth and these tools as a thesaurus of human information.
However, in the moment where money and marketing hit the beehive, we will need to run since bad things will happen.
After all of this, where is the infinite game now?
A few weeks ago, embarking on a trip to Barcelona, I found myself in an unusual situation: surrounded by the hum of an airplane and without my usual digital distractions. No music, no movies, no nothing, just me and my thoughts. The trip was scheduled to take at least four hours, and I had finished the book I brought before I reached the gate.
This unexpected digital detox led me to a simple activity: browsing the photo gallery on my smartphone. I know, I could have tried to sleep, but it was 8 a.m. and I'd already had a few coffees!
It was during this mundane task that I realized something that would soon become the focus of my latest project: our digital lives are cluttered with an excessive amount of nearly identical images.
The seed of this idea began to grow in my head over the next few days: at first, it was just a conversation with myself as I walked along, pondering the necessity and redundancy of storing such a quantity of similar pixels and whether it was necessary in the first place. But as I shared this idea with others, discussing its merits and potential impact, it began to solidify. It even passed the rigorous filters for evaluating new ideas, proposed by Paul Graham, which affirmed its potential (at least in my eyes).
Embarking on a research journey (which was almost entirely new to me), I set out to understand if this concept had been explored before, without wanting to reinvent the wheel. Surprisingly, despite the plethora of tools designed to manage digital clutter, none reflected the vision I had. My project was not only intended to identify duplicate images, but to rethink the way we approach digital storage, emphasizing the quality and uniqueness of our digital memories over mere quantity.
The idea is quite simple but, in my opinion, powerful enough to merit an article describing it: video has been using the inter-frame almost since its origins, but we don’t have something similar applied at the storage level. Here goes my article:
Armed with pencil and paper, I delved into the mathematical representation of this idea using concepts I saw in Information Theory. This phase was crucial to transform a fuzzy concept into a tangible model that could be applied and tested. The challenge was (and remains very real) to create an algorithm capable of discerning and eliminating redundant digital content, thus decluttering our digital lives in a meaningful way and allowing our storage to be more flexible and less redundant.
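To make the idea concrete, here is one classic building block for detecting "nearly identical" images: a difference hash (dHash). This is a sketch of the general technique, not my actual model; it assumes the image has already been decoded and downscaled to a 9x8 grid of grayscale values (a real pipeline needs an image library for that step):

```javascript
// Difference hash (dHash): fingerprint an image so that near-duplicates
// produce nearly identical fingerprints. Input: 8 rows of 9 grayscale
// values (0-255), i.e. the image already downscaled to 9x8.
function dHash(pixels) {
  let bits = "";
  for (let row = 0; row < 8; row++) {
    for (let col = 0; col < 8; col++) {
      // Each bit records whether brightness increases left-to-right.
      bits += pixels[row][col] < pixels[row][col + 1] ? "1" : "0";
    }
  }
  return bits; // 64-bit fingerprint as a bit string
}

// Hamming distance between two fingerprints: a small distance means
// the two images are likely near-duplicates.
function hammingDistance(a, b) {
  let d = 0;
  for (let i = 0; i < a.length; i++) if (a[i] !== b[i]) d++;
  return d;
}
```

Two photos taken a second apart typically land within a few bits of each other, so a gallery cleaner could group fingerprints by Hamming distance and keep one representative per group.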
Writing the essay served as the last step in this process, helping to organize my thoughts and research findings into a cohesive narrative. Like the last piece of a puzzle, it brought clarity and direction to the project, setting the stage for the practical challenges ahead.
Now that I am about to put it into practice, I am aware of the work that remains to be done. Developing examples, potentially even a library, and integrating this concept into a usable tool will require careful planning, coding, and testing. Also, the importance of peer review cannot be overstated here: I need someone else to review this and help me take the idea further. It's fun to share this journey from a simple observation during a flight to a project concept.
It’s crazy how much we can accomplish with a few hours disconnected from any kind of distraction. Even if this project is reduced to zero after a few tests, the amount of knowledge I gathered during the research phase was enough to consider it important and powerful for me.
Just hearing that airplanes are starting to include free wifi makes me wonder what zero-distraction place humans will have next before technology disrupts it again… because, at least for me, flying is the one and only activity where I disconnect from everything else.
Sometimes all we need is a spark without too much wind to generate a fire that lasts forever, the rest is just curiosity.
I can safely say (even without having been able to verify it with my own hands) that reality is coming to an end. Reading and seeing memes out there made with a video and a prompt made me realize that we won't know anything in a few years. We won't be able to detect whether a video has been generated or not, or even be able to describe or remember where a piece of footage came from (e.g., a Twitter post).
Any video that resembles but doesn’t mimic reality will be a possible candidate for AI.
Movies were always part of this “reality,” but with a consensus that to see fake reality, you will pay a ticket, rent a movie, or whatever.
Now, you won't even be asked to accept terms and conditions.
Reality (despite being bombarded before) has ended as we know it.
A new set of technologies will be needed to check and verify reality as we know it; otherwise, nothing and everything will be possible.
There's no point in fighting back, and this won't harm the common people (potentially, in the beginning). However, public figures, politicians, etc., will be in danger, and if I stretch this a bit further, a new Goebbels is being raised on these new tools and technologies.
These AI generative tools are Bernays’s wet dreams.
I guess it is just a matter of waiting, seeing, and hoping for the best.
In the meantime, it is reasonable, safe, and sound to build verification, certification, and validation tools to avoid scapegoat reality.
But being fair, I’m on the e/acc side, effective accelerationism, so I don’t have fear; I kinda like where things are going because I will end up being part of a new humanity, a new society with new problems, and a new set of challenges.
There’s quite a nostalgic aspect to seeing old Londoners walking around the city in 1920. In my opinion, there is a break in time from there to now, and people in the future will see this invention as part of the same nostalgic look: a break that we needed to jump into the new century.
We have just started a new revolution, not industrial or even material, but mental.
However, what does it matter by now?
Note: I attached hyperlinks directly into the text and not as references.
In recent years, there’s been an odd trend in the tech world, particularly with front-end technologies.
We have a parade of brilliant engineers, but they are marching in circles over ground we have already covered, doing the same thing each time, a little more abstractly than the last.
There is nothing inherently wrong with the existing technologies (in fact, I won’t mention a single one here) that we have used in the past. Still, I find it odd so much brainpower is devoted to solving problems that were already effectively addressed a decade ago (or more!).
It’s like watching a bunch of mathematicians trying to prove the Pythagorean theorem over and over again.
This reminds me of Jacques Ellul’s insights in “The Technological Society”.
In a way, we've got a bunch of builders, each claiming they've invented a new type of nail, when we really need someone to figure out a better way to build houses. Consider, for example, the complete dumpster fire that is scientific journal academia.
We might now be stuck in this loop of reinventing the wheel, not because we need a new wheel, but because creating it gives us a sense of accomplishment. Which is ironic.
We can now claim LLMs are one of the latest "achievements of the technological society" when, in reality, Shannon showed in 1948 that English, and almost any human language, is largely predictable. Strictly speaking, that's not the same as having a chat that does stuff for you - I know, I get the point of modern language models - but still:
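Shannon's predictability claim is easy to taste with even the crudest possible model. The sketch below computes a zeroth-order entropy estimate (single-character frequencies only, nowhere near Shannon's actual prediction experiments): English text scores well below the ~4.7 bits/character a uniform 26-letter alphabet would give.

```javascript
// Zeroth-order entropy estimate of a text, in bits per character.
// Counts only single-character frequencies; richer models (digrams,
// words, human prediction) push the estimate down much further.
function entropyPerChar(text) {
  const counts = {};
  for (const ch of text) counts[ch] = (counts[ch] || 0) + 1;
  let h = 0;
  for (const ch in counts) {
    const p = counts[ch] / text.length;
    h -= p * Math.log2(p);
  }
  return h;
}
```

Shannon's own prediction experiments (1951) put English as low as roughly 1 bit per character, which is the sense in which language was shown to be largely predictable long before language models.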
I could continue. However, the point is to ask one simple question: What happened then?
Still, these days, we have plenty of research, advancement, and science, but still, it’s incredible how little “information” we can really get.
Even the Attention Is All You Need paper was “demystified,” which, if I hadn’t read the article, I’d claim a good candidate.
Nowadays, if you chat with an LLM, how do you know it will be truthful?
Furthermore, I’ll end with a conclusion in the shape of a question: Where’s or Who’s the next Bell Labs?
Maybe it’s just simply too much information for everyone to care about.
Maybe it’s just the system that enables us to behave like selfless agents.
Perhaps it’s simply just us.
Last night, I was watching Anatomy of a Fall.
During the film’s first minutes, we hear someone playing the piano entirely out of tempo.
As the movie develops, the piano gets brighter, funnier, and even carries a message.
The player is learning, and the piano becomes an instrument where this character can talk with the audience.
He hit the wrong notes, of course, but he was also dragging the keys into an utterly unarranged set of notes.
In the end, as Debussy said, music is the space between notes or “silence.”
When I started playing the piano (or at least learning to play the piano), each key had a meaning, and after playing the wrong notes for a while, I started to hit the right ones. However, after a few minutes of trying to find the correct ones while reading the music sheet, I realized it didn’t sound like music, or at least it didn’t sound like the piece I was looking for. It was a John Cage piece, at most.
After hours and hours in front of the piano, increasing each movement’s speed and the keys’ overall correctness, I started to hit the notes I wanted correctly.
I played for a few hours more, and then, suddenly, the music began to appear. Note after note, Erik Satie’s Gymnopédie No. 1 at the correct phase started to appear.
Before practicing and reaching the correct tempo, the same notes I was playing didn’t sound like the piece at all.
Now, the exact same exercise made me think that speed is as important as being correct.
It’s been a few years since I stopped making music, producing, and playing altogether, although a good discovery gave me pause for thought last night.
It’s not correctness but speed that makes your work outstanding.
Furthermore, it doesn't make sense for me to now replicate what Darwin did, researching and reaching those conclusions in the Galápagos, even if DIY culture has gained so much traction in the past few years.
At most, it will help you at a personal level, but it won’t reach any boundary or frontier; it will be a more (and totally valid) personal journey.
Now, to contribute ideas and create something, you need to combine two characteristics: you approach the Pareto frontier if you are knowledgeable about something, but you also need the speed to keep it up.
Ideas come from the same family of thoughts: you can be correct in what you think you are saying, but you require speed to practice all that you claim.
Speed does not make you good by the sole fact of being fast, but it makes you fail faster than others and try more.
Otherwise, if ideas are not being implemented, or you are too slow to flesh them out fully, you will end up with just one, maybe two, at most, of what could have been your best ideas.
Yet, it’s not about being first; it’s about doing it in sync with other needs of society. Being first (or failing first, to put it in better terms) without the world being ready for your idea will make you more likely to fail and start again.
Ideally, you will need to fail.
Practically, you need to fail.
If you are too slow to start again or to adapt to new environments, your entire lineage will be gone by the end of your life cycle.
Speed and synchronization make you adaptable to the environment.
Add correctness and the right amount of knowledge, and you will have the perfect combination for doing meaningful work.
As a child, I read many stories, delving into everything from geography to Spanish literature, crime novels, and global history books. Each day, before or after school, I’d go to the local library, eagerly choosing the books that would be my companions for the week.
A few years into this routine, I transitioned to a Kindle, lured by the prospect of accessing a world of books at my fingertips. Yet, unexpectedly, I found myself reading less.
I had the world in my hands, and no restrictions. Everything was as free as I could have wanted.
Each morning during my childhood, I’d rush over to my friend’s house, where a computer, a fascinating device to me, awaited.
He lived just two houses down from my place, and thanks to his computer-enthusiastic uncle, his computer was always the best in the neighborhood.
I remember waking him up early, eager to play games on his PC, which could handle everything from Gears of War to GTA (which, for me, was always a new experience).
When I returned home, I often felt incomplete, wondering why I could not have a similar one. At the time, I did not understand how difficult it was for my father to satisfy my unlimited wants.
My computer was only suitable for emulator games like MAME32 and Counter-Strike. As a child, this felt like a significant challenge. Determined, I chose to confront this limitation. I began tweaking and modifying my father’s computer, finding ways to entertain myself offline, experimenting with regedit, and discovering hidden Windows features. During this time, I also started cracking games, fascinated by how altering a .exe file could drastically change a game’s behavior. Some were simple: drop and replace. Others required disconnecting from the network and following a complex process.
This was thrilling for me. I didn’t fully grasp the technicalities of injecting code into executables to bypass authentication, but it felt like a magical skill acquired through trial and error.
Months later, now quite good with computers (compared with the people I knew), I turned to my father’s library. There, I began my journey into programming, starting with C++ and Visual Basic. I didn’t understand it then, but I quickly recognized the immense potential of programming.
To have the power to say whatever you want to a computer and have it just obey your commands.
Much to my mother’s discouragement, I devoted countless hours to it, preferring coding to outdoor activities, even on perfect summer days.
Rainy days became my alibi, offering more time for coding.
My true passion ignited with Unreal Engine 3.
Before, I had dabbled in modding, code modifications, and cheats. But at around 14, coding in C++ with Unreal Engine was a revelation. I devoured tutorials, guides, articles, and game source code.
I spent considerable time on a Chilean game development forum and the ADVA forum, both pivotal in my journey. There, I met incredible people who made me grow.
There, in the forum, I learned to fail. To be criticized. To be taught. And despite my limited technical skills (mostly self-taught) and English proficiency, I kept using these restrictions as stepping stones.
Eventually, my graphics card’s limitations with Unreal Engine 3 became evident. I needed a more powerful computer but was financially constrained, so I switched to XNA (C#), marking another pivotal shift. The challenge of creating a game from scratch, building everything from the game loop to the sound emitters, was thrilling for me.
Years later, I created numerous games and video tutorials, sharing them on YouTube. This period was entertaining and productive.
During my college years, wanting to stop burning my brain with knowledge, I decided to start recording music with some friends. It all started as a hobby, but after a couple of months, we started to do really well (not the music itself, but the flow of the recording).
It was a sanctuary.
We would record, talk, discuss, improve, repeat.
At the time, we also recorded some podcasts with people who are now great friends.
All these episodes in my life made me good at one thing: understanding good records.
I had to spend so many hours recording with bad microphones and low-tech equipment that we ended up getting excellent at recording podcasts and music.
I had to learn everything about recording and, from there, adapt to the low tech I had at hand since I didn’t have a dime to spend on this.
Now, I consider myself a person who understands mastering processes, audio frequencies, recording techniques, and acoustics, all because I didn’t have money to buy a good microphone.
A couple of years later, I realized something: restrictions made me curious.
Richard Feynman said:
“I bought radios at rummage sales. I didn’t have any money, but it wasn’t very expensive; they were old, broken radios, and I’d buy them and try to fix them.”
And from there he wrote one of the best stories in his books: “He Fixes Radios by Thinking.”
For me, each step of my journey involved overcoming some restriction, be it physical, economic, or social. In retrospect, these restrictions fueled my desire to learn, to satiate my curiosity, and to delve deeper into unknown realms.
I developed a practice: after consuming content like an article, blog post, essay, or video tutorial, I’d recreate the process using a different conceptual idea.
For instance, if the tutorial were about a game, I’d create a tool to make the game. Not the game itself.
Why? Because of self-imposed synthetic restrictions.
There’s no way around restrictions.
If a book talked about simulating the economics of a game, I’d turn everything around: I’d make a game about simulations.
I don’t care about quality; I care about learning the right thing.
We must impose limits on our desires and intentions, harnessing the power of our creativity.
My journey would have been different if I had started with a state-of-the-art PC with no restrictions, no need to go beyond, and no need to optimize. A microphone that could make us sound like pros.
In a context of free and abundant resources, restrictions are seen as something terrible, when in reality they can work like positive feedback loops.
The concept of “synthetic constraints” (self-imposed limits to channel our exploratory and creative efforts) has been a milestone since I started learning and thinking. There is no way around constraints: you either win, or you tell your relatives the story of how you could have won.
Stick with restrictions, and stop when you feel ready.
In 1967, Pamela Huby said this about Aristotle in “The First Discovery of the Freewill Problem”:
…he seems to be on the verge of saying that we cannot use the terms ‘voluntary’ and ‘involuntary’ literally of parts of the soul and to be treating ‘voluntary’ as meaning by definition, ‘that which originates from a man’s soul’, so that any application below the whole-soul level is ruled out. Whatever part of the soul it is that originates the action, the action remains voluntary.
Even today, there is no clear definition of what is “voluntary” and what is not (are protons “voluntary”?), but societies of all types have reached a consensus to be as orderly as possible. Physically speaking, the question is well-discussed and studied, and I won’t go into depth here (I’m also not an expert on the subject).
There is, however, a specific link between motor behavior and the activation of neurons in the cerebral cortex. And something fundamental happened between 1967 and now: humans seem to have learned that they don’t have free will (which, of course, is not a surprise!).
We now have (something which, by Internet standards, is already pretty old) a model trying to explain almost everything.
The conception of time is intrinsically coherent with the idea of life beyond words: no other species on earth can, or will be able to, understand our symbols (symbolic references to knowledge), and because of this, speech and text as content will remain confined to this particular society on earth.
I wonder whether even our own species will understand these words and token conjectures six hundred years from now.
In contrast, life poses a question: are we, as humans, free?
The first response could be: choosing.
Nevertheless, according to Frithjof Bergmann, freedom is not the act of choosing but the ability to identify with essential actions, because you believe in them.
In “On Bergmann’s Theory of Freedom,” Stephen Ball renders the coined sense of being free as:
An act is free if the agent identifies with the elements from which it flows; it is coerced if the agent disassociates himself from the element which guarantees or prompts action
Comparing actions and freedom nowadays, things are starting to feel closer to what Bergmann claimed in his book On Being Free.
Individuals have the power to choose, but they are not free merely because of that.
Based on his idea and view about freedom, Bergmann created a framework to do meaningful work called “New Work,” where agents in the system work towards self-confidence, trust, and independence using and manipulating technology to generate and produce the outcomes we want.
Even so, the world is moving towards a self-centered view of work, and this idea will fit right in within the next few years, if it is not happening already. So, while ideas of independent production of goods are kept in place, it is commonplace for them to fight against each other over what to choose.
If we can talk precisely about the actions performed by an individual in a set amount of time, we can predict a potentially large set of future movements. However, movements happen (as in any complex system), making the system harder to understand as time increases.
In a theoretical exploration of decision-making within a constrained system, consider a simplified model consisting of two agents. These agents are capable of only two distinct actions upon encountering a scenario: act or ignore. Initially, the environment these agents inhabit is devoid of stimuli – a blank slate offering no information for interaction.
A single symbolic element is introduced into this environment as the system evolves. The agents, upon encountering this element, exercise their decision-making capabilities. Their actions, or lack thereof, are meticulously logged, creating a historical record of interactions. For instance, the log might reflect actions such as:
Agent A decides on Element
Agent A <act> on Element
Agent B decides on Element
Agent B <ignore> Element
This log illustrates the binary decision-making process in its simplest form. With each new iteration or epoch of the simulation, the agents are presented with the same singular element, leading to a variety of potential scenarios:
Agent B may <act> towards Element
Agent B may <ignore> Element
This process repeats, exhaustively exploring all possible actions in relation to the Element.
In this microcosm, the act of choosing – to engage or to disregard – is fundamentally binary. Given that each choice is independent (the discussion of interdependent actions is reserved for a more complex model), the total number of possible actions for each agent against an element is represented as:
Actions = C^N
Where:
C = number of actions (2 in this case: act or ignore)
N = number of elements (1 initially)
This formula allows for calculating potential outcomes based on agent decisions.
In this model, agents are limited to decisions regarding the presence of an Element within their universe. The scope of their choices is confined to the singular element available in their ecosystem. They cannot choose between multiple elements (e.g., Element_1 or Element_2) as the system’s design does not support such complexity.
However, the illusion of choice expands as the system evolves and more elements are introduced.
The decision-making landscape becomes increasingly complex, offering a broader array of potential interactions.
This expansion illustrates a fundamental principle: the complexity of choice within a system is directly proportional to the number and nature of stimuli in that system.
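This proportionality is easy to see numerically. A quick sketch in shell, using the two actions (act or ignore) from the model above:

```shell
# The decision space grows as C^N: C = 2 actions, N = number of elements
for n in 1 2 3 4; do
  echo "$n element(s) -> $((2 ** n)) possible combinations"
done
# -> 1 element(s) -> 2 possible combinations
# ...
# -> 4 element(s) -> 16 possible combinations
```

Each new element doubles the number of combinations, which is why the illusion of choice expands so quickly as stimuli are added to the system.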
However, the decision system humans use is broken, and the illusion of choosing between one Element and the other points to the incapability of the human mind to recognize, out of a massive set of potential decisions, which one we are choosing.
Determinism works both in the mechanical and in the quantum world. Particles don’t go around asking people names or being more of what they are. Knowledge is aligned with the beginning of its philosophical movement. Those who know will understand, and it’s about how many symbols you can recognize.
Slack of skin makes you perish without blood, sand, and salt.
Now, knowing an end, you can be (and I believe you will be) shaped by a deprecated system for every word written. A system that is no longer able to maintain its empire, with no weapons other than words, and that relies on the luck of arrangement to communicate complexity.
A symbolic lettering system (an alphabet) cannot communicate pure information while allowing infinite permutations of symbols, on top of rules under which the same system can later produce pure noise. We need to create a symbolic system that does not let us lie, neither in the count nor in the content.
As long as alphabets allow us to produce noise, we will be in a complex, non-redundant situation. Despite this, humans need a more robust communication system with historical characteristics.
Will new empires reign without words?
Still, grammatical rules across intersectional domains will be the kings & queens of the physical world.
A mathematical language that puts us closer to a god. A logical one.
When humans decide, they tend to speak (or to think in the same way we speak), although in this communication of ideas the environment, or even the message’s sender, makes the signal noisy. In addition, languages provide humans with a means to transmit, transform, and evolve ideas from simple energy spikes into complex and robust mechanisms that fundamentally change the world as we know it. Languages also enable the creation of concepts that do not exist in the real world and the comparison of hypothetical scenarios.
However, current linguistic systems are a double-edged sword for humans. The flexibility of language often leads to uncontrolled noise. Having discussed free will and the freedom of choice, the following hypothetical system is proposed: it restricts humans to a fixed set of symbols and rules, preventing the addition of more elements.
Computers are powerful because they facilitate abstractions, the foundation upon which empires are built. However, speaking amidst noise will always be an unavoidable challenge.
In the following hypothetical construct, let’s introduce a symbolic system with a binary alphabet consisting of two symbols, <1> and <0>, representing the true and false logic states, respectively.
This system is further defined by a set of constraints we call <grammatical rules>, shaping how symbols are combined and interpreted.
Bypassing these rules is penalized by the system, causing the word not to be recognized.
Using the rules defined above, we construct a truth table for “words” within our system and their associated meanings:

| Words | Explanation |
| --- | --- |
| 01 | A false event with a true result |
| 00 | A false event with a false result |
| 10 | A true event with a false result |
| 11 | A true event with a true result |
| EOS | End of symbols |
Each combination of <1> and <0> is assigned a specific meaning, creating a precise, unambiguous mapping between the symbolic representation and its interpretation in the domain of discussion.
In this symbolic system, with the format “state > result,” each two-symbol word comprises a “state” (the first symbol) and a “result” (the second symbol).
This format provides a structured method for interpreting the symbols: the state is either <1> (true) or <0> (false), and the result is either <1> (true) or <0> (false).
The words “01”, “00”, “10”, and “11” thus correspond to the different scenarios in the table above.
This system workflow allows for precise, logical interpretation of the coordination of actions.
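As a sketch of how such a rules engine could behave (the decode function below is my own invention for illustration, not part of any formal model), a shell case statement covers the four legal words and rejects everything else:

```shell
# Map each legal two-symbol word to its meaning; anything else is not recognized
decode() {
  case "$1" in
    01) echo "A false event with a true result" ;;
    00) echo "A false event with a false result" ;;
    10) echo "A true event with a false result" ;;
    11) echo "A true event with a true result" ;;
    *)  echo "word not recognized" ;;  # bypassing the grammatical rules is penalized
  esac
}

decode 10   # -> A true event with a false result
decode 21   # -> word not recognized
```

The closed set of branches is what keeps the mapping unambiguous: every recognizable word has exactly one meaning, and everything else falls through to the penalty case.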
While precise within its domain, this system faces limitations when interfacing with other symbolic systems like Castellano or American English, for example.
The inability to map these words to systems that do not adhere to the same rules or without knowledge of the specific symbolic rules of the receptor system highlights a critical challenge: the potential for misinterpretation or “noise” in communication across different symbolic frameworks.
The constraint of not allowing permutations at certain levels and the prohibition of combining words serve to reduce ambiguity and noise, enhancing the clarity and specificity of communication within the system.
However, these same constraints limit the system’s flexibility and adaptability when interfacing with external systems.
In conclusion, in my view, society needs to create a symbolic reference system with a set number of symbols and a defined rules engine that allows us to communicate by avoiding noise as much as possible.
The system effectively reduces noise by attaching domain-specific meanings to symbol combinations and imposing grammatical constraints. It allows agents to communicate bi-directionally without wasting too much energy on the format of speech.
However, this comes at the cost of reduced versatility and potential challenges in cross-system communication. The balance between precision and adaptability is a crucial consideration in designing any symbolic system, particularly in the context of knowledge creation and transfer.
Handling and analyzing CSV files is a common task in data processing and analytics. This blog post delves into several powerful tools and methodologies for manipulating CSV files, especially huge ones.
Imagine this situation: after a few conversations with a bunch of data providers, you decide to buy a dataset from one of them. Following the initial chat, you receive an email saying, “Hey, this is the data you just bought,” and immediately you find yourself with a 3GB CSV file on your server, waiting there to be consumed.
After hours of trying different software (Excel, Hex Fiend, etc.), you realize you cannot play with this data manually; you need to bring a new tool to your workbench.
When we’re testing and playing around with data, we tend to imagine that the files are small enough to be easily manipulated through whatever software we use daily. We also rely on remote services that store the data (looking at you, Google Workspace!) or on queries against our servers (PostgreSQL, or even an API exposing the file in chunks). Yet doing it directly in your local environment often makes a lot more sense than doing it remotely.
Here’s why: working locally is cheaper (no service fees), more private (the data never leaves your machine), faster for exploration (no network round-trips), and independent of your internet connection.
With these advantages in mind, let’s dive in.
In this post, we’ll cover:

- csvtools - a swiss army knife for CSV files (csvcut, csvstat, csvstack, csvsql)
- sqlite for CSV data
- zq - a tool for working with structured data

csvtools - swiss army knife for CSV files

A little bit slower than the other alternatives,
csvkit is a toolkit that greatly simplifies working with CSV files. It has a bunch of useful commands that help you visualize the data you are dealing with.
csvcut
Purpose: to selectively keep or remove columns from a CSV file.
Example:
Consider an employees.csv file. To exclude the email column, use:
csvcut -C email employees.csv
csvstat
Purpose: generates comprehensive statistics for each CSV column.
Example:
Running csvstat on employees.csv:
csvstat employees.csv
This command outputs detailed statistics like the mean, max, and min for each column.
csvstack
Purpose: merges multiple CSV files with identical columns into one.
Example:
To combine employees1.csv and employees2.csv:
csvstack employees1.csv employees2.csv > employees_combined.csv
csvsql
Purpose: converts CSV files into SQL tables.
Note: this command is a shortcut for the next tool we discuss in this blog post.
Example:
For employees.csv:
csvsql --table=employees --db sqlite:///employees.db --insert employees.csv
This creates a SQLite database, employees.db, and inserts the CSV data into an employees table.
sqlite for CSV Data

SQLite offers native support for reading CSV files, providing an efficient way to import and manipulate CSV data within a database context.
Example:
Importing employees.csv into SQLite:
sqlite3 employees.db
.import --csv employees.csv employees
Note: you will need sqlite3 to run this command, but it’s the most flexible option on this list.
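Once the data is imported, the full power of SQL is available from the command line. A small sketch, using a hypothetical employees.csv with name and department columns (the `.import --csv` flag needs a reasonably recent SQLite, 3.32 or later):

```shell
# Hypothetical sample data, for illustration only
printf 'name,department\nAda,Engineering\nGrace,Engineering\nJean,Design\n' > employees.csv

# Import the CSV and run an aggregate query in one shot
sqlite3 employees.db ".import --csv employees.csv employees"
sqlite3 employees.db "SELECT department, COUNT(*) FROM employees GROUP BY department ORDER BY department;"
# -> Design|1
# -> Engineering|2
```

Because the data now lives in a real database file, you can keep running ad-hoc queries against it without re-parsing the CSV each time.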
Unix pipes offer a really straightforward way to deal with CSV files, and are particularly useful for merging or querying them.
cat file1.csv file2.csv | awk 'BEGIN { FS=OFS="," } { print $1,$2,$3 }' > joined_file.csv
cat file1.csv file2.csv > file_total.csv
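One caveat with the plain cat approach: each file’s header row ends up in the combined output. A small sketch that keeps only the first header, assuming both files share the same columns (file1.csv and file2.csv here are hypothetical samples):

```shell
# Hypothetical sample files, for illustration only
printf 'id,name\n1,Ada\n' > file1.csv
printf 'id,name\n2,Grace\n' > file2.csv

# Take the header from the first file, then the data rows from every file
head -n 1 file1.csv > file_total.csv
for f in file1.csv file2.csv; do
  tail -n +2 "$f" >> file_total.csv
done

cat file_total.csv
# -> id,name
# -> 1,Ada
# -> 2,Grace
```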
zq - my personal favorite

zq excels in processing various data formats, including CSV and JSON, making it a robust tool for quick data analysis. It’s part of the toolset provided by https://zed.brimdata.io
Awesome for dealing with huge JSON files, or even CSV files we want to convert to a more structured format.
Using zq to filter and view data from a CSV file:
zq -i csv -Z 'total_aum_analysed>0' sfdr-2023-10-05.csv
This command filters the records where total_aum_analysed > 0 and displays the first few of them.
The tools and techniques discussed are essential for anyone dealing with massive CSV datasets (or even different formats).
From simple column manipulation to complex data joining and querying, these tools allow us to efficiently handle a wide range of data processing tasks without relying on external services or tools.
Whether you’re a data analyst, developer, or researcher, mastering these tools can significantly enhance your data manipulation capabilities. It could open up a whole world of local data manipulation that (at least for me) I had never considered before.
In this guide, I’ve focused on tools you can use on your computer. I chose not to include cloud services or big-name data providers for a few simple reasons:
Cost: Many of these services charge money (even on testing suites!), which can add up, especially if you’re working with a lot of data or exploring what you have received.
Complexity: Some of these tools can be pretty complicated to set up if you don’t have an infra team. I wanted to keep things straightforward and easy to follow.
Privacy: Using external services often means sending your data online. I wanted to avoid privacy worries by keeping everything on your computer.
Internet Dependence: These services usually need a good internet connection, which you might not always have.
So, I stuck to local tools to make things simpler, cheaper, and especially useful if you’re exploring your datasets or trying things out.