Whenever I mention that I watch the webserver logs, I see a spike in proxies, anonymizers, and other "tricks" with which people believe they "anonymize" themselves or "prevent me from tracking them." Now I'm going to tell you a bunch of trivialities about Apache logs and/or the HTTP protocol. The techies can keep quiet, but apparently the rest of you are not familiar with the basic concepts, so it's better to clarify.
First, I do NOT have a real analytics infrastructure. That means I don't have a Hadoop cluster on which I run Spark jobs (for example), or a Splunk cluster with its Operational Intelligence doing outlier detection. So I would say calm down: all I have are servers into which I feed the logs, which I keep for 15 days and visualize with Grafana. Stuff that, given the volume of traffic, can run on an Odroid (we are talking about machines like this: https://www.pollin.de/p/odroid-hc1-einplatinen-computer-fuer-nas-und-cluster-anwendungen-810766?&gclid=CjwKCAjwusrtBRBmEiwAGBPgE0E-yFO83xfQbDknPmP2qaM-pz7dCk1PbI1oELnuyNDL4uk2Rdz_SRoCYgoQAvD_BwE , about as powerful as a latest-generation mobile phone).
Second: "tracking" is different from "identifying". Identifying means that I learn where you live, your name, surname, the size of your tits, and so on. Saying that you are tracked does NOT mean that I know who you are: it means that I can distinguish you from anyone else, even though I don't know who you are. With this I can do very little, but if we are talking about Google, Facebook or the like, the game is very different.
I process ordinary Apache logs in the combined format. This is what ANY web server does, so it's the bare minimum. The reason why I can see that one of you is a lesbian girl is not any devilry I do on my machines. It is due to the bad practices you are adopting.
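To give an idea of how little is involved, here is a minimal Python sketch of reading one line in Apache's "combined" format. The sample line is entirely made up (documentation IP range, invented host and token), and the regular expression is a simplification that assumes no escaped quotes inside the fields; it is not what runs on my servers.

```python
import re

# Apache "combined" format:
#   %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
# Simplified pattern: assumes the quoted fields contain no escaped quotes.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one combined-format line as a dict, or None."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Hypothetical log line, invented for illustration only.
SAMPLE = (
    '203.0.113.7 - - [10/Oct/2019:13:55:36 +0200] "GET /post.html HTTP/1.1" '
    '200 5120 "https://mail.example.com/zimbra/?tok=abc123" '
    '"Mozilla/5.0 (X11; Linux x86_64) Firefox/69.0"'
)

if __name__ == "__main__":
    entry = parse_line(SAMPLE)
    print(entry["referer"])  # where the click came from
    print(entry["agent"])    # what the client says it is
```

The referer and the user agent are sent by the client itself: the server only has to write them down.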
For example, I'm noticing things like this:
I hope this gentleman does NOT believe that using his company's internal mail protects him in any way. Because in doing so, through the referer, he is informing me that his company uses Zimbra as its mail software. And that is not neutral information: if it ends up in the hands of a hacker, he will go here ( https://www.cvedetails.com/vulnerability-list/vendor_id-7863/Zimbra.html ), look up the vulnerabilities, and the next time you come back you will find a nice exploit right on the home page.
Moral of the story: use the fucking browser normally. Browsers at least try to hide similar information. Zimbra itself has no countermeasures for these things.
In the "mail client" category I see all sorts, always with unique tokens:
But here I am not the bad guy: we are talking about information that this person is broadcasting urbi et orbi across the internet. If it ends up in my hands, amen, so be it: I just learn that someone really uses Zimbra. But if the fact that the company uses Zimbra ends up in someone else's hands, we're talking about a different matter.
Same thing for those who use feed readers.
Things like that don't give you security. Not only are you telling me which feed reader you use, you are also giving me the unique token. And this was supposed to keep you from being tracked? And yet, every time I write that I read the webserver logs, the users coming from Feedly increase. It does not work. I repeat: use a good browser. The same applies to these gentlemen here:
The same applies to those who have created discussion groups or use comments: what happens is that most commenting systems add IDs and tokens to requests, from which (simply by clicking on the URL) it is possible to find the origin and the author of the post.
The same applies to Forumfree, which apparently thinks I should know exactly which message the click came from:
Even some browser add-ons insert their own lovely unique ID:
Ah, right: then there are those who are anonymous "because they come from the search engine". Quite right.
Hi, tovarish! Long time no see!
Bingo!!! Another unique ID!
Moral of the story: use the fucking browser. If you really want to prevent your IP from being tracked, use Tor Browser, but all these tricks are laughable. If the purpose was not to be tracked, all these unique IDs identify you well enough that I could write (if I had the time and the inclination) a rule that selectively blocks you.
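For the curious, this is roughly what such a rule could look like, as a Python sketch. The threshold (16 or more URL-safe characters counts as "probably a unique token"), the function names and the example referer are all assumptions of mine for illustration, not something I actually run.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Assumption: a query value made of 16+ URL-safe characters is "token-like".
TOKEN_LIKE = re.compile(r'^[A-Za-z0-9_-]{16,}$')

def find_tokens(referer):
    """Return (parameter, value) pairs in the referer query that look like tokens."""
    query = urlsplit(referer).query
    return [(key, value) for key, value in parse_qsl(query)
            if TOKEN_LIKE.match(value)]

def should_block(referer, blocked_values):
    """True if the referer carries any token value listed in blocked_values."""
    return any(value in blocked_values for _, value in find_tokens(referer))

# Hypothetical referer, invented for illustration only.
REFERER = "https://reader.example.com/view?feed=42&session=Zx81kQp04LmN7vTa93bC"

if __name__ == "__main__":
    print(find_tokens(REFERER))
    print(should_block(REFERER, {"Zx81kQp04LmN7vTa93bC"}))
```

The point is not the blocking itself: once a value is unique to you, recognizing it again is a one-line comparison.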
Moreover, there is such a variety of user agents:
that in any case it is always possible to combine "Monte Compatri" or "Lanzo d'Intelvi" with the specific browser version for a Huawei, in its specific release.
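A rough sketch of that combination, in Python. It assumes the visits have already been geolocated by some other means, and the user agent strings are invented placeholders, not real ones from my logs.

```python
from collections import Counter

# Hypothetical, already geolocated visits: (place, user agent).
# The user agent strings are invented placeholders.
VISITS = [
    ("Monte Compatri", "Mozilla/5.0 (Linux; Android 9; HUAWEI-EXAMPLE) Chrome/77.0"),
    ("Monte Compatri", "Mozilla/5.0 (Windows NT 10.0) Firefox/69.0"),
    ("Lanzo d'Intelvi", "Mozilla/5.0 (Windows NT 10.0) Firefox/69.0"),
    ("Monte Compatri", "Mozilla/5.0 (Linux; Android 9; HUAWEI-EXAMPLE) Chrome/77.0"),
]

def fingerprint_counts(visits):
    """Count visits per (place, user agent) pair, used here as a crude fingerprint."""
    return Counter(visits)

if __name__ == "__main__":
    for (place, agent), count in fingerprint_counts(VISITS).items():
        print(count, place, agent)
```

The same browser in two different towns gives two different fingerprints, and the same pair showing up twice is, with good probability, the same person coming back.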
Ultimately, therefore, my advice is very simple: use a fucking browser. If you try to anonymize yourself with that stuff you're using, or with a cell phone, you're failing.
Every time I mention the logs, the growth of URLs that contain unique tokens, and are thus able to identify the individual user (which does not mean that I know who you are, it means that I can distinguish you from anyone else!), makes me feel the need to tell you one simple thing.
If you don't know what you are doing, it is better to do what everyone else does: download Firefox and use it with a normal bookmark. That is the situation in which you leak the least information.
As I said, for the rest I do NOT have a real data mining infrastructure, so I enjoy exploring the logs, but nothing more.
The trouble comes when you do these things with other sites that may have fewer scruples, or with Facebook, or with Google.
That said, I hope I have cleared up the anxieties you have. If not, you can always use Tor Browser: https://www.torproject.org/download/
That, at least a little, and I mean a little, really does anonymize you.
What I wrote is nothing new: any junior sysadmin who is familiar with Apache logs knows it.
But as far as I can see, everyone else reacts, as soon as I merely mention the logs, in a way that worries me.
The remedies you come up with ARE WORSE THAN THE DISEASE.
Moreover, even the funny guy who pushed my readings up to half a million a day yesterday (about +300%) is wasting his effort, so he is kindly asked to stop: I am perfectly able to distinguish page hits from visitors, and I know how many hits per page there usually are.
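Telling hits from visitors takes very little. A minimal Python sketch, assuming a visitor is approximated by the (IP, user agent) pair; the entries below are invented and use documentation IP ranges.

```python
from collections import Counter

def hits_and_visitors(entries):
    """entries: iterable of (ip, agent, path). Return (total hits, distinct visitors)."""
    entries = list(entries)
    # Assumption: one visitor is approximated by one (ip, agent) pair.
    visitors = {(ip, agent) for ip, agent, _ in entries}
    return len(entries), len(visitors)

def hits_per_visitor(entries):
    """Count how many hits each (ip, agent) pair produced."""
    return Counter((ip, agent) for ip, agent, _ in entries)

# Hypothetical entries: one client hammering a page, two ordinary readers.
ENTRIES = [("203.0.113.7", "curl/7.65", "/post.html")] * 5 + [
    ("198.51.100.2", "Firefox/69.0", "/post.html"),
    ("198.51.100.9", "Firefox/69.0", "/index.html"),
]

if __name__ == "__main__":
    print(hits_and_visitors(ENTRIES))
    print(hits_per_visitor(ENTRIES).most_common(1))
```

Someone reloading the same page in a loop inflates the first number and leaves the second one untouched, which is exactly why it stands out.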