alufers

Wanted to show off my little project which helps with reverse engineering APIs used by various apps. It takes HTTP traffic captured by mitmproxy and generates an OpenAPI specification for a given REST API.

I have used it already on two apps and the results are good enough to write an alternative client or quickly automate some stuff.

mhils

mitmproxy dev here, very awesome! :) This seems to be particularly useful to quickly generate clients for reverse-engineered APIs.

mohsen1

Swagger Editor dev who now works at Airbnb here. This is hilarious!

SOLAR_FIELDS

Hilarious indeed! The first thing I thought of with this project is actually AirBnB, because the sort/filter/map view is so terrible and missing features. AirBnB captures data on a bunch of stuff, but doesn't make it possible to search for it in the UI (ever want a property with a lake view or a sauna? AirBnB knows which ones have those things, but they won't let you look for them!)

AirBnB doesn't have an official API but changes the tags so often that scrapers people put up on Github go out of date quickly. Now I can run this whenever I want to have actual search functionality (instead of the hobbled crap available on the website) and ensure that whatever flavor of API is available on the website that day is easily queryable!

anitil

What a fantastic idea! I have so many half baked things that some idiot (me) built without documenting the underlying API. This will make life so much easier

lancebeet

This is a really clever project. It seems like an obvious idea once you've seen it, but it clearly isn't. Thank you for sharing it.

ludovicianul

This is great :) You can then fuzz your APIs for issues using https://github.com/Endava/cats.

upupandup

does it capture route/server rendered pages too?

alufers

It does, but it will only generate schema descriptions for JSON endpoints. This means that the URL and method will appear in the spec, but not the response/request schema.
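A minimal sketch of what "JSON endpoint" could mean in practice, assuming the decision is made from the Content-Type header. The helper name `is_json_content_type` is hypothetical and this is not necessarily the check the tool itself performs.

```python
def is_json_content_type(content_type):
    """Return True if a Content-Type header value denotes a JSON body.

    Hypothetical helper: strips parameters such as "; charset=utf-8" and
    accepts both "application/json" and structured "+json" media types.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")
```

Under this assumption, an HTML page would still contribute its URL and method to the spec, but no schema would be inferred for its body.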

nickysielicki

This is really incredible. With a rooted android phone and these tools, plus a couple others [1,2,3], you can get a skeleton to implement a backend for any app you want.

[1]: https://github.com/koxudaxi/fastapi-code-generator

[2]: https://github.com/ioxiocom/openapi-to-fastapi

[3]: https://infosecwriteups.com/hail-frida-the-universal-ssl-pin...

andreidd

That's interesting, but it won't work with native code that statically links an SSL implementation.

jeroenhd

In many applications you can bypass built-in verifications with some Frida [1] code. It requires more effort to do so, of course, as you'd need to find the OpenSSL methods (with a script like this [2]) and bypass the verification in there.

If you're really intent on getting it to work, downloading the binary, patching out the verification function and putting it back is also possible if you're root.

[1]: https://frida.re/docs/android/

[2]: https://mobsecguys.medium.com/exploring-native-functions-wit...

SemanticStrengh

Can this be used to generate REST documentation for your own frontend just by interacting with it? This should be augmented via a crawler that clicks every clickable element recursively.

alufers

Totally, but you would need to do some manual cleanup and naming afterwards to make it more useful than just reading the source code. You could also, for example, use your integration tests (if you have some) to capture as many routes as possible.

SemanticStrengh

of course the generated doc should be refined (e.g. filling missing types, error codes) but your lib would save us a lot of work and make the world a better place.

tomatowurst

"...and we expect it to be free and open source as our budget for this is zero."

Labo333

Very nice!

On the same note, I wrote a program to generate Python code (requests) from a HAR capture: https://github.com/louisabraham/har2requests

I think using HAR captures is simpler for the end user than spawning mitmproxy as they don't require any installation and are extracted from the network tab of the browser devtools. Is there a reason why you didn't use them?

EDIT: I realized that mitmproxy can also get traffic from other devices like phones. Very cool project, I will think about modifying mine to support mitmproxy captures!
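For anyone curious what working with a HAR capture looks like, here is a rough sketch, assuming the standard HAR layout (`log` → `entries` → `request` with `method` and `url`). The function name `endpoints_from_har` is made up for illustration and is not taken from har2requests or mitmproxy2swagger.

```python
from urllib.parse import urlsplit


def endpoints_from_har(har):
    """Return the sorted, de-duplicated (method, path) pairs found in a HAR dict.

    `har` is the parsed JSON of a HAR export, e.g. json.load(open("capture.har")).
    Query strings are dropped so repeated calls to one route collapse together.
    """
    seen = set()
    for entry in har["log"]["entries"]:
        request = entry["request"]
        seen.add((request["method"], urlsplit(request["url"]).path))
    return sorted(seen)
```

This only lists routes; turning them into a spec would still need path templating and schema inference on top.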

alufers

Hey! Just writing to let you know that I've added HAR input support to mitmproxy2swagger.

Labo333

Wow that's super cool! Thanks!

olabyne

Oh, I used a Python script to generate pre-made requests from HAR recently, I'm pretty sure it was your repo! Very useful :)

Labo333

Thanks!

captn3m0

Almost an exact fit for my idea[1] to generate OpenAPI from HAR files. Going to read through to see if I can add HAR support.

[1]: https://github.com/captn3m0/ideas#openapi-specification-gene...

efitz

OpenAPI is just the latest version of swagger. Should not be hard to change.

I was able to translate HAR to OpenAPI with this web site's free preview: https://www.apimatic.io/transformer/

I also see others are working on the same thing: https://github.com/dcarr178/har2openapi

alufers

Hey! Just writing to let you know that I've added HAR input support to mitmproxy2swagger.

jeroenhd

Very interesting! Would this also be able to determine what kind of auth (header tokens, cookies, etc) the APIs require or is that something you still need to detect manually?

alufers

At this point yes, you still need to detect it manually, but I am working on adding this.
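As a rough idea of what such detection could look like, here is a sketch that guesses the auth style from one request's headers. Everything in it is an assumption for illustration: the function name `guess_auth`, the returned labels, and the `X-API-Key` header (a common convention, not a standard). It is not the planned implementation.

```python
def guess_auth(headers):
    """Guess how a request authenticates from its headers (hypothetical heuristic).

    `headers` is a plain dict of header name -> value. Header names are
    compared case-insensitively, as HTTP requires.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    auth = lowered.get("authorization", "")
    scheme = auth.split(" ", 1)[0].lower() if auth else ""
    if scheme in ("bearer", "basic"):
        return scheme
    if "x-api-key" in lowered:
        return "api-key-header"
    if "cookie" in lowered:
        return "cookie"
    return "none"
```

A real detector would probably need to compare authenticated and unauthenticated requests, since a cookie being present does not prove the endpoint requires it.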

upupandup

this is absolutely insane!!! I understand capturing the REST api network part, is it then examining the request body, headers being sent back and forth to figure out the API?

alufers

Yes, this is basically what this program does.
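To make the idea concrete, here is a minimal sketch of inferring a JSON-Schema-style description from a single observed body. The function name `infer_schema` is made up for this example and it is not the project's actual code; a real tool would also have to merge many observations per endpoint.

```python
def infer_schema(value):
    """Describe one decoded JSON value as a JSON-Schema-style dict."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        return {"type": "null"}
    if isinstance(value, list):
        schema = {"type": "array"}
        if value:
            # Assumption: the first element is representative of the rest.
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
        }
    raise TypeError(f"not a JSON value: {type(value).__name__}")
```

Feeding it `json.loads(body)` for each captured request and response gives a starting point that still needs the manual cleanup mentioned elsewhere in the thread.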

ninkendo

From what I understand it’s also somewhat how JIT works in various JavaScript engines: observe the sorts of objects (which naively have the performance characteristics of hash tables) you see, and start defining static offsets for fields you observed. The JIT’d (fast) objects may morph over time as new fields are observed, but I’d imagine it’s a similar idea to creating documentation… “this object tends to have these fields, so just pretend those are the only fields it can have, until another request proves otherwise”, with similar guess/checking for their types/etc.
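A small sketch of that "pretend these are the only fields until another request proves otherwise" idea, applied to documentation rather than a JIT. The names `new_shape`, `observe` and `required_fields` are invented for this example; it only tracks top-level fields and Python type names.

```python
def new_shape():
    """Start with no observations and no known fields."""
    return {"count": 0, "fields": {}}


def observe(shape, obj):
    """Fold one observed JSON object (a dict) into the running shape."""
    shape["count"] += 1
    for key, value in obj.items():
        field = shape["fields"].setdefault(key, {"types": set(), "seen": 0})
        field["types"].add(type(value).__name__)
        field["seen"] += 1
    return shape


def required_fields(shape):
    """Fields present in every observation so far are assumed required."""
    return sorted(
        key
        for key, field in shape["fields"].items()
        if field["seen"] == shape["count"]
    )
```

Each new object can widen a field's type set or demote a field from required to optional, which is the "morphing over time" part of the analogy.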

aleksiy123

Really awesome, I tried my hand at writing something similar and was surprised at how well it actually ended up working.

I feel like the next step is automatically generating load tests and/or fuzzing tests. Felt like that could be a real product.

eligro91

Really amazing.

We have hundreds of undocumented endpoints created over the years, and running this tool on our backends will instantly create good documentation.

Thanks for that! Will give feedback if there are any issues.

POPOSYS

Can we have this as a browser dev tool please? F12 -> Tab REST -> Create spec from API

Divyeshkharade

This looks amazing. Will it also capture data types like enumerators by somehow detecting patterns?

alufers

I thought about it, but it would be hard to distinguish between an enumerator and just static data. For example if you logged in with only one account it could classify the "username" field as an enumeration, because there is only one captured value.
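The problem can be shown with a naive heuristic. This is only a sketch: the function name `looks_like_enum` and the thresholds `max_distinct` and `min_samples` are arbitrary assumptions, not anything the tool implements.

```python
def looks_like_enum(values, max_distinct=5, min_samples=20):
    """Guess whether a string field is an enumeration (hypothetical heuristic).

    A field counts as an enum only if it was observed at least `min_samples`
    times and took at most `max_distinct` different values.
    """
    return len(values) >= min_samples and len(set(values)) <= max_distinct
```

With `min_samples=1`, a single captured "username" value would be classified as an enum, which is exactly the false positive described above. Raising the threshold avoids that case but needs far more captured traffic.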

freedomben

Yeah I imagine that is nearly impossible without capturing data at scale. Awesome tool! I'm super grateful :-)

efitz

This is awesome; I’m going to try it as soon as I get back to my desk. I’ve been working on trying to glue together tools to translate Charles proxy output to OpenAPI (swagger). I think it would be a great tool to have in a web app reverse engineering toolbox.

Show HN: Mitmproxy2swagger – Automagically reverse-engineer REST APIs - Hacker News