Internet: Centralization vs. Decentralization, Why We Need Both, and Why It Should Go Open Source
The Internet has become almost entirely centralized: whether you use a search engine, a social network, or nearly any other service, a central party sits in the middle.
Complete centralization has many drawbacks:
1) Our data is no longer ours.
2) The data stays in the hands of the giants, and so does the ranking of that data. You have no control over what you see; it is decided by the machine learning algorithms they run over our aggregated data, and those can be selective and biased.
3) Monopoly keeps increasing: all the power rests with only a couple of superpowers.
4) If the central servers shut down, the Internet is gone, so we need a plan B.
So centralization has become a necessary evil, because pure decentralization, such as peer-to-peer technology, does not solve the technical problems that centralization does.
Instead of treating centralization and decentralization as mutually exclusive, there can be another way: a hybrid of both. The hybrid makes everything open source and distributes the power to everyone.
A hybrid technology needs two things:
1) Many central servers that host the index files and other services. An analogy for these would be the Ubuntu Linux mirror servers, which are kept in sync.
2) Your personal computer, together with other always-online hosting servers, which host the files (a distributed database) available for both download and upload.
Such a hybrid addresses many issues: it is easy to set up and cost-effective, since you do not have to invest a huge amount in scalable servers.
It could be a disruptive technology, but it requires a painstaking specification and careful software.
A usual browser will no longer work, and the communication protocol will be completely different.
It also has many hitches, such as security, encryption, and data integrity, which need to be overcome.
I have started to sketch, in a naive manner, how it could work.
Querying needs the four standard operations (CRUD: Create, Read, Update, and Delete).
Create: this operation requires an assembling central server. Its job is to create a chunk under a name and store all the create queries issued by different clients. When the file grows to a certain size, it becomes a complete chunk, ready to be downloaded by different clients for hosting.
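The create path above can be sketched as a toy in-memory server. Everything here (the class name, the size limit, using a query count instead of bytes) is my own illustrative choice, not part of any specification:

```python
# A toy "assembling central server": it buffers create queries into an
# open chunk and seals the chunk once it reaches a size threshold.
# CHUNK_LIMIT and all names are hypothetical placeholders.

CHUNK_LIMIT = 3  # queries per chunk; a real system would measure bytes

class AssemblingServer:
    def __init__(self):
        self.open_chunk = []        # create queries still being assembled
        self.complete_chunks = {}   # chunk name -> sealed list of queries
        self.counter = 0

    def create(self, table, record):
        """Store one create query; seal the chunk when it is full."""
        self.open_chunk.append((table, record))
        if len(self.open_chunk) >= CHUNK_LIMIT:
            name = f"chunk-{self.counter}"
            self.complete_chunks[name] = self.open_chunk
            self.open_chunk = []
            self.counter += 1
            return name  # chunk is now ready for clients to download and host
        return None      # chunk still assembling

server = AssemblingServer()
server.create("users", {"id": 1})
server.create("users", {"id": 2})
sealed = server.create("users", {"id": 3})
print(sealed)  # the third query completes and seals chunk-0
```

Once sealed, a chunk would be advertised in the index file so hosting clients can pull it.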
Read: this operation takes a table name, which is mapped through the index file to a chunk name. From the chunk name, the IP addresses of available clients hosting that chunk are looked up, and the data is fetched from one of them.
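The two-step lookup (table name to chunk name, then chunk name to hosting clients) could look like this toy version, where the dictionaries stand in for the real index file and network, and all names and addresses are invented for illustration:

```python
# Toy read path: table name -> chunk name via the index file, then
# chunk name -> hosting clients, then fetch from the first client
# that actually has the chunk. Purely illustrative data structures.

index_file = {"users": "chunk-0"}                    # table -> chunk name
chunk_hosts = {"chunk-0": ["10.0.0.5", "10.0.0.9"]}  # chunk -> client IPs
hosted_data = {                                      # what each client stores
    "10.0.0.5": {"chunk-0": [{"id": 1, "name": "alice"}]},
}

def read(table):
    chunk = index_file[table]           # 1. map table name to chunk name
    for ip in chunk_hosts[chunk]:       # 2. find clients hosting the chunk
        data = hosted_data.get(ip, {}).get(chunk)
        if data is not None:
            return data                 # 3. fetch from an available client
    return None                         # no online client has the chunk

result = read("users")
```

In a real system step 3 would be a network download, and unavailable clients would simply be skipped, as the loop does here.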
Delete: this also takes a table name, which is mapped through the index file to a chunk name. From the chunk, the id of the table field is obtained and stored in the delete index file.
Whenever any other client comes online, it fetches the delete index file and, with permission, deletes the specified data.
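The come-online step could be sketched as follows: the delete index file lists which records are marked for deletion, and a client applies it to whatever chunks it hosts locally. The record layout and names are my own hypothetical choices:

```python
# Toy delete flow: the delete index file holds (chunk name, record id)
# pairs; a client that comes online applies them to its hosted chunks.
# All structures are illustrative placeholders.

delete_index = [("chunk-0", 2)]  # records marked for deletion

local_chunks = {
    "chunk-0": [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}],
}

def apply_delete_index(chunks, delete_index):
    """Called when a client comes online: purge records marked deleted."""
    for chunk_name, record_id in delete_index:
        if chunk_name in chunks:  # only touch chunks this client hosts
            chunks[chunk_name] = [
                r for r in chunks[chunk_name] if r["id"] != record_id
            ]

apply_delete_index(local_chunks, delete_index)
print(local_chunks["chunk-0"])  # only the record with id 1 remains
```

The permission check mentioned above would gate `apply_delete_index`; I have left it out to keep the sketch short.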
Update: this is the complicated one, because we cannot update all the data dispersed across the network. An update should instead create a new index entry for the table and then delete the old one, with the mapping from the old table index to the new table index kept in a separate update index file. During a read operation, this update index file is also consulted to resolve the query.
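The create-new-then-delete-old scheme, with reads following the update index, could look like this toy version. The chunk contents, names, and single-step chain are all illustrative assumptions:

```python
# Toy update flow: an update writes a new chunk, queues the old one for
# deletion, and records old -> new in an update index file. Reads follow
# that mapping to reach the newest version. Names are placeholders.

chunks = {
    "chunk-0": {"name": "alice"},   # old version, still hosted somewhere
    "chunk-1": {"name": "alicia"},  # new version created by the update
}
table_index = {"users": "chunk-0"}     # may still point at the old chunk
update_index = {"chunk-0": "chunk-1"}  # old chunk -> replacement chunk
delete_index = ["chunk-0"]             # old data queued for deletion

def read(table):
    chunk = table_index[table]
    # Follow the update chain until we reach a chunk with no replacement.
    while chunk in update_index:
        chunk = update_index[chunk]
    return chunks[chunk]

print(read("users"))  # {'name': 'alicia'}
```

Following a chain (the `while` loop) matters because a record may be updated several times before the stale index entries are cleaned up.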
A feedback index can also be kept, to keep track of spam and the like.
Also, downloading the updated index file must be personalized to keep the index file size as small as possible; index entries are downloaded based on each page you visit.
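Personalized index download could mean fetching index entries lazily, one per visited page, instead of the whole file. This toy client shows the idea; the page names and entry format are invented for the example:

```python
# Toy personalized index: a client fetches only the index entries for
# pages it actually visits, so its local index stays small relative to
# the full index held by the central servers. Illustrative data only.

full_index = {"home": "chunk-0", "blog": "chunk-1", "shop": "chunk-2"}

class Client:
    def __init__(self):
        self.local_index = {}  # grows only with pages actually visited

    def visit(self, page):
        # Download just this page's index entry, on demand.
        if page not in self.local_index and page in full_index:
            self.local_index[page] = full_index[page]
        return self.local_index.get(page)

c = Client()
c.visit("home")
print(len(c.local_index), "of", len(full_index))  # 1 of 3 entries held
```

A client visiting every page would eventually hold the full index, so some eviction policy would also be needed; that is beyond this sketch.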
This is just a rough sketch that came to my mind; more details and better approaches can be discovered along the way.