
Growtika

A couple years back John Reilly posted on HN "How I ruined my SEO" and I helped him fix it for free. He wrote about the whole thing here: https://johnnyreilly.com/how-we-fixed-my-seo

Happy to do the same for you if you want.

The quickest win in your case: map all the backlinks the .net site got (happy to pull this for you), then email every publication that linked to it. "Hey, you covered NanoClaw but linked to a fake site, here's the real one." You'd be surprised how many will actually swap the link. That alone could flip things.

Beyond that there's some technical SEO stuff on nanoclaw.dev that would help - structured data, schema, signals for search engines and LLMs. Happy to walk you through it.

Update: OK, this is getting more traction than I expected, so let me give some practical advice.

1. Google Search Console - did you add and verify nanoclaw.dev there? If not, do it now and submit your sitemap. Basic but critical.

2. I checked the fake site and it actually doesn't have that many backlinks, so the situation is more winnable than it looks.

3. Your GitHub repo has tons of high-quality backlinks, which is great. Reach out to those places and tell the story. I'm sure a few will add a link to your actual site. That alone makes you way more resilient to fakers going forward. This is only happening because everything is so new. Here's a list of all the backlinks pointing to your repo:

https://docs.google.com/spreadsheets/d/1bBrYsppQuVrktL1lPfNm...

4. Open social profiles for the project - Twitter/X, LinkedIn page if you want. This helps search engines build a knowledge graph around NanoClaw. Then add Organization and sameAs schema markup to nanoclaw.dev connecting all the dots (your site, the GitHub repo, the social profiles). This is how you tell Google "these all belong to the same entity."

5. One more thing - you had a chance to link to nanoclaw.dev from this HN thread, but you linked to your tweet instead. Totally get it, but a strong link from a front-page HN post with all this traffic and engagement would do real work for your site's authority. If it doesn't cross any rule (specific use case here, so maybe check with the mods haha), drop a comment here with a link to nanoclaw.dev. I don't think anyone here would mind if it gets you a few steps closer to beating that fake site.
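Concretely, the Organization + sameAs markup from point 4 could look something like this JSON-LD in the `<head>` of nanoclaw.dev. This is only a sketch: the social-profile URLs are placeholders for whatever accounts actually get created, and the repo path should be whatever the canonical repo is.

```html
<!-- Sketch of Organization/sameAs structured data for nanoclaw.dev.
     The X and LinkedIn URLs below are placeholders, not real accounts. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "SoftwareSourceCode",
  "name": "NanoClaw",
  "url": "https://nanoclaw.dev",
  "codeRepository": "https://github.com/qwibitai/nanoclaw",
  "author": {
    "@type": "Organization",
    "name": "NanoClaw",
    "url": "https://nanoclaw.dev",
    "sameAs": [
      "https://github.com/qwibitai/nanoclaw",
      "https://x.com/nanoclaw_example",
      "https://www.linkedin.com/company/nanoclaw-example"
    ]
  }
}
</script>
```

You can sanity-check whatever you end up with in Google's Rich Results Test before shipping it.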

adamtaylor_13

This is very generous of you!

If I were the author, however, I'd still feel like I've been put in a predicament where I need to spend personal agency to fix something that Google has broken.

While that may just be a fact of life, my internal injustice-o-meter would be raging. Like, Google is going to take hours of my life because they, with all their billions of capital, can't figure out the canonically-true website when it's RIGHT THERE in the GitHub repository?

Ugh. I guess that's just the world we live in. But it makes me rage against the machine on the author's behalf.

MerrimanInd

I had the exact same thought while reading the above comment, as helpful and generous as it is. Google's entire business model is to help people find things on the internet. They're an insanely well-resourced company with all kinds of smart programmers. They have a moral and financial incentive to direct people to canonical sources of information. And STILL it's on this open-source dev to do all the steps outlined just to get the situation corrected?

pocksuppet

Google's business model is to help Google's customers pay money to Google. Google Search's customers are mostly scammers who run adverts. Helping the user find a thing is at odds with helping the user find a scam that pays Google money.

allthetime

The billions of capital are exactly why they don't care about you. Also, Google didn't break anything. The only person who can claw out a place for you in this giant machine is you - all while billions of others attempt to do the same.

sam1r

I can’t be the only one blasting "Killing in the Name" in my noise-canceling headphones the moment I read your comment...

yieldcrv

Author already is spending personal agency

So the feeling is fine, and if he’s going to bother at all, which he is, he should be doing it efficiently. Everything so far was panic and inefficiency

gowld

How many Google search results would point to OP's site?

If Google didn't exist, how many Google search results would point to OP's site?

input_sh

> This is very generous of you!

No it's not, it's a sales pitch that intentionally ignores some of the things pointed out in the article. The author has invested time into proper SEO optimization, legit websites already link to it et cetera, it's all explained in the article.

From the perspective of a spammer: They need like 2 million MAU to earn below minimum wage. You're never getting those figures by doing something legit and actually useful to a tiny subset of people. You either need a vague site beyond any point of usefulness to anyone or you need a network of knockoff sites. The reason you can't compete with these shitty SEO spam version of your site is because they already have a network of "authoritative" (in Google's eyes) sites and all they have to do is to link from them to a new one to expand their shitty network.

From the perspective of SEO agencies: They can't guarantee results. They can tell you vague, easily-googleable best practices and give you an output of some SEO SaaS that's far too expensive for an individual to purchase. Ahrefs(.com) is the prime example of this, the cheapest paid version costs $129/month. Do you care about SEO that much? No, so you go to these agencies and give them money for them to give you the output of such a tool. But that SaaS also only contains vague and nebulous "things to fix" to follow "best practices" because they also cannot know what drives traffic to your competitor from the outside perspective.

My best suggestion would be to start a website from day one. Doesn't matter how good the website is at first; Google favours sites that have existed for longer. If you're creating a website after the knock-off version already exists, you might as well give up immediately, it's gonna be near impossible to recover from that.

adamtaylor_13

> No it's not, it's a sales pitch that intentionally ignores some of the things pointed out in the article.

Sales pitch or not, someone offering their time to help me with a problem feels generous to me. To each their own, I suppose.

But again, you reinforce my point in your last sentence. Now, anytime I want to make any little toy project (because how can anyone know when their toy project will blow up overnight?), I have to make a full-blown website just to ensure I don't get SEO-spammed into oblivion?

My point still stands. Google is the problem and while we likely can't effectively do anything about it, it's frustrating as hell.

baxtr

Helping each other out is human, isn't it?

RyanOD

Lame to have to do all this pointless busy work just to "win" the SEO battle.

danny8000

If NanoClaw generates some revenue, you should trademark the name and also buy nanoclaw.com. Move the site to the .com domain and then do the steps above. All things being equal, the ".com" TLD should get you higher page rank than your existing ".dev". Google is ranking the fake ".net" page higher than your ".dev". If your page weren't on the .dev TLD, it might be second already.

jongjong

All this work to solve one website's problem... You can be sure MANY other open source projects are facing the same issue. It's just not a viable solution. There is something wrong with Google. Google has to fix it.

mlrtime

For someone who only kind of gets SEO, this was a bit enlightening, thank you. I have no use for this information myself, but it helps me understand.

eviks

> Google Search Console - did you add and verify nanoclaw.dev there?

Did you read the post before promoting yourself?

> Submitted to Google Search Console probably 15 times.

> map all the backlinks the .net site got (happy to pull this for you), then email every publication that linked to it.

The links are already correct:

> NanoClaw got covered in The Register, VentureBeat, The New Stack, all linking to the real site.

graeme

Fantastic advice

prohobo

"Open social profiles for the project" haha if only it were that easy...

AznHisoka

I’m looking at this from a third-party point of view (definitely not claiming the .net “deserves” to rank higher).

1) the .net version has a couple of very high authority links, namely from theregister and thenewstack (both of which have had lots of engagement).

I highly doubt it would have ranked without those links.

2) it's only been a week. Give Google time to understand which pages should rank higher.

3) Google is biased towards sites that cover a topic earlier than others.

I’ve seen pages that are still top 3 for a particular competitive query years later, simply because they were one of the first to write about it.

Suggestions: give it time. Meanwhile I would recommend linking to your website rather than your github everywhere you mention it, to give it a boost

niam

If it saves anyone else the effort: I went to double-check the claim that those articles cited the wrong page, and it seems you're correct on The Register, but archive.org's earliest copies of the other two articles don't seem to reference the impostor site. They refer instead to the GitHub.

https://web.archive.org/web/20260301133636/https://www.there...
https://web.archive.org/web/20260211162657/https://venturebe...
https://web.archive.org/web/20260220201539/https://thenewsta...

phkahler

>> I’ve seen pages that are still top 3 for a particular competitive query years later, simply because they were one of the first to write about it.

With so many copycats on the internet, first to publish seems like a fairly good indication of the original source. But as we can see here, that's not always true.

Calzifer

> 3) Google is biased towards sites that cover a topic earlier than others.

> I’ve seen pages that are still top 3 for a particular competitive query years later, simply because they were one of the first to write about it.

Reason why I still always get the Java 8 docs for any search. Annoying.

wavemode

I think the real reason for that is simply that a lot of people are still running Java 8 (so those docs still see a lot of traffic). I remember reading that it's still used by something like 25% of Java developers.

tyingq

Most of the problem is the "only been a week" part, likely. Though you're fighting an algorithm that's been patched in inconsistent places for all sorts of weights like "authority" and "quality".

Thousands of little weights driven by obscure attributes of the site that you're not really going to figure out by thrashing and changing stuff.

graemep

I think the precaution developers should take is having a website and adding a page to it for each project.

If you must have just a repo, self-host it. In fact, self-host the repo in any case.

uyzstvqs

I did some experimenting with different search engines and AIs. Here are the results:

Google and Brave linked to the official GitHub repo followed by the fake domain. DuckDuckGo and Bing linked to the fake domain first, followed by the official GitHub. Mojeek gave higher ranking to two third party articles, but linked to both the official GitHub and website without fakes. Qwant was the worst, as the official website was the second result amongst multiple fake websites and an unrelated GitHub repo.

Then there are the AIs. ChatGPT, Google AI mode, Gemini, Grok, Perplexity, and Brave Search "Ask" all linked to the official website, and some added the GitHub repo as well. DuckDuckGo Search Assist linked to just the official GitHub. Google AI mode, Gemini and Grok also explicitly warned about the fake websites. Copilot got the official website and GitHub right, but linked to a presumably fake X account as well.

Conclusion: Google, Brave and Mojeek win in search. AI is very good and clearly beats search overall. Google AI mode, Gemini and Grok stand out in quality.

spyder

For you... But the results are different for different users.

For me, Google shows the .net site first and the GitHub one second.

Asking ChatGPT 5.2 (Auto mode) to search for the nanoclaw site, it says the same: it links the .net site first and shows the GitHub as an optional page. When I try to give it a hint by asking "are you sure?", it even hallucinates that the site is linked from the GitHub:

"Yes — nanoclaw.net is the official documentation/site for the NanoClaw project, in the sense that it’s the project’s published homepage and is directly linked from its canonical open-source repository. It describes the project, features, installation steps, and links to the source code on GitHub, which is the authoritative source for the project’s codebase."

ChatGPT 5.2 (Thinking mode) and Claude get it right on the first try; they answer with the official .dev page first, and Claude shows the .net site second as "another site covering the project".

andai

I was surprised by what you said, so I used a browser that's not logged in to a Google account, to compare. Indeed the fake site ranks #1! Dang!

I guess Google has my account in an autism bucket, so biases GitHub links higher ;)

1kurac

I tried AltPower Search and it exhibits the same issue as Google. I think you might just need to give it more time to index. Nanoclaw.dev has only been available for a week. Then, it's the lower relative reputation of the 'dev' vs. the 'net' domain ...

[1]: https://altpower.app
[2]: https://web.archive.org/web/20260000000000*/https://nanoclaw...
[3]: https://radar.cloudflare.com/tlds

sghitbyabazooka

this thing is just google with a theme

Marsymars

How did you prompt the AIs?

markus_zhang

My advice to all OSS developers: if you open source your project, expect it to be abused in all possible ways. Don't open source if you have anxiety over it. It is how the world works, whether we like it or not.

I appreciate that you open source your projects for us to study. But TBH, please help yourself first.

pocksuppet

In particular, if you license it MIT, and it's useful, expect Amazon to make a fork, not give you the source code, and earn tens of millions of dollars from it while you don't get a cent.

There's writing code for charity, and then there's this. Charity wasn't meant to include hyper-corporations.

nananana9

If you want evil megacorps to give you money when they use your thing, maybe say "if you're an evil megacorp you have to give me money when you use my thing" in the license?

If your license reads "hey, you can use this however you want, no matter who you are, and don't have to give me money", people will use it however they want, no matter who they are, and won't give you money.

Unfortunately, for decades, free software fanatics have bullied inexperienced and eager programmers who don't know any better into believing that an actual sustainable development model that respects their work is evil, and that we should all work for free and beg for donations.

pocksuppet

Exactly. (A)GPL tried to balance this, in ways that still partially work. MIT software just throws its hands up and donates itself to evil megacorps in anger. If you believe in charity for ordinary people but not for evil megacorps, you'll put something in the license that evil megacorps don't like, or that forces them to work for the benefit of everyone by releasing their work that builds on yours.

You can't write "Amazon may not use this" and still be free as in freedom, but terms that force sharing seem to work.

gorjusborg

> free software fanatics have bullied inexperienced and eager programmers

We must travel in different circles. I've been around a while, and I've never seen _any individual_ bullied for keeping their code closed source.

That said, I have an extreme bias toward only using open source code, for practical reasons, and I'm open about that.

sfRattan

> Unfortunately, for decades, free software fanatics have bullied inexperienced and eager programmers, who don't know any better into believing that an actual sustainable development model that respects their work is evil and that we should all work for free and beg for donations.

Silicon Valley hype monsters have done this, sure. And so have too many open source software advocates. But all the free software advocates I've read and listened to over the years have criticized MIT- and BSD-style permissive licenses for permitting exactly the freeloading you describe.

markus_zhang

What if they simply use the code and don't give you the $$$? Are you going to sue them?

shevy-java

I agree that MIT may not be the best licence here in such a use case scenario. The question is why corporations think they can be leeches though - and the bigger, the more of a leech they are on the ecosystem. That's just not right.

vablings

The idea that software that is free NEEDS to be open source because "I don't want something running on my computer", coming from people who will then go and download the precompiled binary anyway, hurts my head a lot.

RcouF1uZ4gsC

With the cloud, GPL won’t protect you either

atls

AGPLv3 attempts to solve this problem, by forcing SaaS providers to open-source their modifications.

https://www.gnu.org/licenses/agpl-3.0.en.html

j1elo

Depends on the needs of the licensor. AGPLv3 solves the problem of other players taking the code, improving it privately, and not sharing those improvements. But AGPLv3 is not a silver bullet for people who write Open Source code and expect to make a living from it. "Open Source is not a business plan".

https://news.ycombinator.com/item?id=45095581

smegger001

> if you license it MIT, and it's useful, expect Amazon to make a fork, not give you the source code,

That's why the GPL family of licenses exists.

MIT/BSD-family licenses are "do whatever you want with this".

If you want to make money off of your pet open-source project, I recommend multi-licensing it: a copyleft license with copyright assignment required for contributions, plus other licenses offered for a fee.

Andrex

Maybe Stallman had something of a point...

RcouF1uZ4gsC

Nope. Stallman helped create this mess.

Free software underpins all the infrastructure of surveillance capitalism.

ekjhgkejhgk

Stallman is always right, and HN always downvotes it.

frizlab

And whatever license you use, expect it to be crawled by AI, and to have AI providers make millions on it.

gowld

So? I am not about to create AWS. I'm glad people can use my free software on their own machines, on rented servers, or hosted by an expert.

alpaca128

AWS can profit more from it than smaller organizations or individuals, making it even more untouchable by potential competition.

A market with little competition costs you too in the long term.

Ma8ee

Are you still glad when AWS starts selling you software as a service and make hundreds of millions and you get nothing?

mkehrt

I don't understand your point? If you write code with an MIT license, this is what you would expect.

shevy-java

Totally agreed.

I find it strange that people use the MIT licence and then complain "big greedy corporation did not contribute back anything". Though I also agree that this leeching approach by corporations is a problem to the ecosystem. MIT just is not the right licence to fight that.

pocksuppet

People are conditioned not to think about it.

pfrrp

There is even a software "law" related to this: https://www.hyrumslaw.com/

" With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody. "

ariehkovler

It's worse than that. There's a SECOND imitator that I actually stumbled on today while looking something up about nanoclaw - nanoclawS [dot] io - and that one's harvesting email addresses.

The obvious risk here is a bait and switch, where one of these sites switches their link to the Github repo to point to a malicious imitator repo instead.

One approach would be to go after the sites themselves, not their Google ranking. See if their hosts are willing to take them down. Is there anything you can assert copyright over to hang a DMCA request on? That's hard for an open-source project, I guess. And the fake sites aren't (yet) doing any actual scamming.

Good luck, though!

yorwba

The article says "Filed takedown notices with Google, Cloudflare, and the domain registrar spaceship.com"

ariehkovler

Yeah but you do need to hang the takedown on some technical reason like copyright or scamming. The issue here is there's no obvious victim. Makes a takedown harder.

mx7zysuj4xew

Since the clone site isn't doing anything obviously malicious like spreading malware or blatantly illegal content none of those parties will take any action whatsoever, nor should they.

jacquesm

It isn't doing that now, but you can't be sure about what they're going to be up to a little ways down the line. The fact that they are clearly trying to misdirect the traffic is proof positive that they're up to no good.

Just do a bit of risk assessment on what would happen if something like this were shipped to people who have come to blindly trust the source, and you'll see why letting this slide is a very bad idea.

pocksuppet

Most registrars and hosts consider phishing already malicious, even if there's no obvious malware download or anything.

james_marks

*yet

Build the audience first, attack comes later

bob1029

Losing the SEO battle is a lot like losing money on the stock market. The system you are fighting is incredibly efficient and will never in a trillion years give a single shit about your specific concerns. You can hire lawyers and spend time complaining about it all day on social media. But you'll rarely get a drop of blood out of this stone. The best you can do is to step back, reevaluate your understanding of the market, and adjust your strategy.

Vegenoid

I pay for Kagi to get better search results. Lately, I’ve felt that Kagi’s search has been just as full of low-information and AI generated results as Google. I’ve been wondering why I’m still paying for it. This seemed like a good litmus test. Unfortunately, Kagi displays pretty much the same results as Google for nanoclaw.

soiltype

Yeah that's increasingly been my feeling as well. I have to keep prefacing my Kagi recommendations with, "web search is less and less useful every year, but..."

I still appreciate being able to customize rankings, bangs, and redirects. But with how utterly shit the web is overall, any web search is basically only good if you know the site(s) the answer(s) will be on. When you're searching for something novel-to-you, even Kagi is just going to show you a full page of unregulated slop on the dumbest, just-registered-this-year domains. Real information is increasingly limited to small islands of trust.

duskdozer

Isn't Kagi basically just using a blocklist? In which case it's whack a mole as new sites spring up or bubble up to the top of other results. I keep my own blocklist and intermittently search key phrases to blanket block new sites, and there's often new sites popping up.

Vegenoid

This sounds very interesting, could you elaborate on your methods and tools?

duskdozer

https://github.com/iorate/ublacklist

Basically:

- whenever a spam site is in my results, click to block
- click through some and check out pages like "About", "Terms", "Disclaimer", "Privacy"; common boilerplate phrases often abound (affiliate marketing spam and similar, for example)
- exact-string search those phrases (this often gets many pages of results)
- using a userscript, block all domains in each result page

My results have gotten so much better for common types of queries. Even then, new sites seem to pop up almost daily. Add them to the list.
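If it helps anyone trying this: a uBlacklist personal blocklist is just one rule per line, either a Chrome-style match pattern or a slash-delimited regular expression. The domains here are made up for illustration; check the uBlacklist docs for the exact rule syntax.

```text
*://*.spam-recipe-farm.example/*
*://seo-slop.example/*
/best-top-10-.*\.xyz\//
```

The match patterns block whole domains; the regex form catches a family of similarly named sites in one rule, which matters given how fast the new ones spring up.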

duxup

I don’t like any search engines now :(

CSSer

It's because the search engine is being eaten by the LLM. I'm not suggesting that it's a perfect substitute. It's just what I feel is happening.

TSiege

more like LLM garbage are rotting search engines from the inside out

duxup

Naw this is a pre LLM problem.

frereubu

I hadn't really noticed anything like this until you pointed it out. My main use for Kagi is to pin Wikipedia results... I just tried searching for "nanoclaw" on Kagi (I'm in the UK so results biased towards there) and got:

1. nanoclaw[dot]net (!)

2. github.com/qwibitai/nanoclaw which looks like a ripoff?

3. Three videos, at least one of which looks like slop with crypto ads

4. github.com/gavrielc/nanoclaw which I presume is the real repo judging from the name?

5. Three "interesting finds" the top one of which is nanoclaw.dev, but with the title "Don't trust AI agents" because it's a blog post from that site

6. A fork of the qwibitai/nanoclaw repo

bigiain

> 2. github.com/qwibitai/nanoclaw which looks like a ripoff?

That is literally the GitHub repo the original article shows as being "real".

allthetime

Piggybacking on the Claw hype, surprised when someone piggybacks on you...

stusmall

Especially when the original claw had to change its name because it was piggybacking on another product's hype...

ajross

That was exactly my first thought. The better framing here isn't "honest site victimized by Google linking to their IP-thieving scammer clone", it's "dude lost in an arms race of eyeball chasing and is salty about it".

GeoAtreides

And I'm losing the sanity battle for my own mind with all these AI-generated posts. Please, I beg you: two lines by your hand are worth 100,000 generated tokens.

MarkSweep

The link on GitHub to the real site is marked with rel="nofollow". I wonder if it would make sense for GitHub to remove nofollow in some circumstances. Perhaps based on some sort of reputation system or if the site links back to the repo with a <link rel="self" href="..." /> in the header? Presumably that would help the real site rank higher when the repo ranks highly.
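For context, here's a sketch of the two halves of that handshake. The rel="self" link is the hypothetical mechanism proposed above, not an existing standard for this; the closest existing convention is rel="me", which some platforms (e.g. Mastodon) use for exactly this kind of bidirectional verification.

```html
<!-- Roughly what GitHub renders today for a repo's website link: -->
<a href="https://nanoclaw.dev" rel="nofollow">nanoclaw.dev</a>

<!-- The proposed reciprocal signal in the <head> of nanoclaw.dev,
     pointing back at the canonical repo ("self" is hypothetical here): -->
<link rel="self" href="https://github.com/qwibitai/nanoclaw">
```

If both halves agree, GitHub could in principle drop the nofollow for that one outbound link without opening the door to link spam.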

geocar

I don't see any reason that GitHub should use rel="nofollow"

Github only has authority because people put their shit there; if people want to point that back at the "right" website, Github should be helping facilitate that, instead of trying to help Google make their dogshit search index any better.

I mean, seriously, doesn't Bing own Github anyway?

pocksuppet

Perverse incentives strike again! Websites that allow links in user-generated content are spammed with user-generated spam links to improve SEO of spam sites, which hurts the site's own reputation because most of the links on it are spam. To avoid this, all sites use nofollow.

geocar

As this example shows, by all sites using nofollow, Github is improving the SEO of spam sites.

What the fuck are you talking about?

Sweepi

> When you Google "NanoClaw," a fake website ranks #2 globally, right below the project's GitHub.

Unfortunately, the fake website [.net] is also #3 on Kagi, and #1 on Duckduckgo. On Kagi, the Github is #1 and nanoclaw.dev is #4, but only if you count "Interesting Finds". On Duckduckgo, the Github is #2 and nanoclaw.dev is nowhere to be found.

tracker1

Do what Louis Rossmann did... just ask Google's AI what you need to change on your site... Apparently that's the secret now.

I'm losing the SEO battle for my own open source project - Hacker News