38 Commits

Author SHA1 Message Date
Kevin Pham
cb409f96d4 add package.json 2023-11-26 17:48:11 -06:00
Kevin Pham
d71ebe5137 test demo 2023-11-26 16:44:04 -06:00
Kevin Pham
6c54d31086 add dynamic resource url patcher as standalone responsemodifier 2023-11-23 08:14:52 -06:00
Kevin Pham
5d55a2f3f0 refactor rewriters to modify html in single pass with multiple rewriters; improve html rewriter edge case handling 2023-11-22 23:51:52 -06:00
Kevin Pham
7668713b1a fix recursive proxy calls 2023-11-22 07:21:48 -06:00
Kevin Pham
bfd647e526 url rewrite improvements 2023-11-21 23:25:34 -06:00
Kevin Pham
efa43a6f36 minor fix with parameterization 2023-11-21 20:58:16 -06:00
Kevin Pham
854dafbcfa improve js rewriting functionality by not relying on js to get proxy and proxified URLs; direct injection from golang 2023-11-21 20:54:09 -06:00
Kevin Pham
a4e016b36c add common referrer options 2023-11-21 20:33:52 -06:00
Kevin Pham
0e620e46ab organize rewriters 2023-11-21 18:44:33 -06:00
Kevin Pham
0fc0942095 encorporate url encoding issue fix from ddba232a31 2023-11-21 15:09:24 -06:00
Kevin Pham
dab77d786f url rewriter tweaks 2023-11-21 14:10:37 -06:00
Kevin Pham
543192afbe support js URL rewriting; support post req 2023-11-21 10:45:29 -06:00
Kevin Pham
79a229f28c handle srcset resource URL rewrites; monkey patch JS for URL rewrites 2023-11-20 23:42:50 -06:00
Kevin Pham
6222476684 forward content-type headers 2023-11-20 15:49:39 -06:00
Kevin Pham
5d46adc486 wip 2023-11-20 15:37:07 -06:00
Kevin Pham
1d88f14de2 rewrite resource URLs based on html tokenizer instead of regex 2023-11-20 11:38:53 -06:00
Kevin Pham
5035f65d6b wip 2023-11-19 20:59:55 -06:00
Kevin Pham
ee9066dedb refactor wip 2023-11-19 15:03:11 -06:00
Kevin Pham
98fa53287b fix nil pointer deref 2023-11-18 18:13:46 -06:00
Kevin Pham
f6341f2c3e begin refactor of proxy engine 2023-11-18 08:31:59 -06:00
Gianni Carafa
6d8e943df5 add env var docker run command 2023-11-16 14:14:17 +01:00
Gianni Carafa
68e5023ed9 Revert "remove rulesets from base repository"
This reverts commit 8d00e29c43.
2023-11-16 14:01:57 +01:00
Gianni Carafa
8d00e29c43 remove rulesets from base repository 2023-11-16 13:30:23 +01:00
Gianni Carafa
c8d39ea21f readd ruleset 2023-11-16 13:27:57 +01:00
Gianni Carafa
dae4afb55e fix typo 2023-11-16 13:10:55 +01:00
mms-gianni
a83503170e Merge pull request #41 from deoxykev/refactor_rulesets
refactor rulesets into separate files and add a ruleset compiler cli …
2023-11-16 13:07:11 +01:00
Kevin Pham
0eef3e5808 refactor rulesets into separate files and add a ruleset compiler cli flag 2023-11-15 15:30:23 -06:00
Gianni Carafa
7597ea2807 udpate README 2023-11-15 21:28:23 +01:00
Gianni Carafa
235dca8dd0 minor ruleset improvements 2023-11-15 21:04:42 +01:00
mms-gianni
191279c00c Merge pull request #40 from everywall/39-request-header-fields-too-large
fix request header fields to large
2023-11-15 20:46:09 +01:00
mms-gianni
f4060c3e78 Merge branch 'main' into 39-request-header-fields-too-large 2023-11-15 20:45:59 +01:00
mms-gianni
55284f0b24 Merge pull request #37 from deoxykev/organized_rulesets
Organized rulesets
2023-11-15 20:45:10 +01:00
mms-gianni
f7f4586032 Merge branch 'main' into organized_rulesets 2023-11-15 20:40:36 +01:00
Gianni Carafa
fe881ca661 use cookie method to empty cookie header 2023-11-15 16:48:00 +01:00
Gianni Carafa
86700d8828 set empty cookie 2023-11-15 16:34:56 +01:00
Kevin Pham
a8d920548c add feature to load ruleset from directory or gzip file on http server, refactor ruleset loading logic 2023-11-14 15:57:39 -06:00
Kevin Pham
e87d19d7f5 add ability to load rulesets from directory 2023-11-14 15:42:26 -06:00
71 changed files with 3309 additions and 367 deletions

View File

@@ -22,11 +22,7 @@ jobs:
       -
         name: Set version
         run: |
-          VERSION=$(git describe --tags --abbrev=0)
-          echo -n $VERSION > handlers/VERSION
-          sed -i 's\VERSION\${VERSION}\g' handlers/form.html
-          echo handlers/form.html >> .gitignore
-          echo .gitignore >> .gitignore
+          echo -n $(git describe --tags --abbrev=0) > handlers/VERSION
       -
         name: Set up Go
         uses: actions/setup-go@v3

View File

@@ -42,11 +42,7 @@ jobs:
       - name: Set version
         id: version
         run: |
-          VERSION=$(git describe --tags --abbrev=0)
-          echo -n $VERSION > handlers/VERSION
-          sed -i 's\VERSION\${VERSION}\g' handlers/form.html
-          echo handlers/form.html >> .gitignore
-          echo .gitignore >> .gitignore
+          echo ${GITHUB_REF#refs/tags/v} > handlers/VERSION
       # Install the cosign tool except on PR
       # https://github.com/sigstore/cosign-installer

View File

@@ -48,12 +48,12 @@ Certain sites may display missing images or encounter formatting issues. This ca
 ### Binary
 1) Download binary [here](https://github.com/everywall/ladder/releases/latest)
-2) Unpack and run the binary `./ladder`
+2) Unpack and run the binary `./ladder -r https://t.ly/14PSf`
 3) Open Browser (Default: http://localhost:8080)

 ### Docker
 ```bash
-docker run -p 8080:8080 -d --name ladder ghcr.io/everywall/ladder:latest
+docker run -p 8080:8080 -d --env RULESET=https://t.ly/14PSf --name ladder ghcr.io/everywall/ladder:latest
 ```

 ### Docker Compose
@@ -106,7 +106,7 @@ http://localhost:8080/ruleset
 | `LOG_URLS` | Log fetched URL's | `true` |
 | `DISABLE_FORM` | Disables URL Form Frontpage | `false` |
 | `FORM_PATH` | Path to custom Form HTML | `` |
-| `RULESET` | URL to a ruleset file | `https://raw.githubusercontent.com/everywall/ladder/main/ruleset.yaml` or `/path/to/my/rules.yaml` |
+| `RULESET` | Path or URL to a ruleset file, accepts local directories | `https://raw.githubusercontent.com/everywall/ladder-rules/main/ruleset.yaml` or `/path/to/my/rules.yaml` or `/path/to/my/rules/` |
 | `EXPOSE_RULESET` | Make your Ruleset available to other ladders | `true` |
 | `ALLOWED_DOMAINS` | Comma separated list of allowed domains. Empty = no limitations | `` |
 | `ALLOWED_DOMAINS_RULESET` | Allow Domains from Ruleset. false = no limitations | `false` |
@@ -115,9 +115,10 @@ http://localhost:8080/ruleset
 ### Ruleset
-It is possible to apply custom rules to modify the response or the requested URL. This can be used to remove unwanted or modify elements from the page. The ruleset is a YAML file that contains a list of rules for each domain and is loaded on startup
+It is possible to apply custom rules to modify the response or the requested URL. This can be used to remove unwanted or modify elements from the page. The ruleset is a YAML file, a directory with YAML Files, or an URL to a YAML file that contains a list of rules for each domain. These rules are loaded on startup.

-See in [ruleset.yaml](ruleset.yaml) for an example.
+There is a basic ruleset available in a separate repository [ruleset.yaml](https://raw.githubusercontent.com/everywall/ladder-rules/main/ruleset.yaml). Feel free to add your own rules and create a pull request.

 ```yaml
 - domain: example.com # Includes all subdomains
@@ -176,7 +177,7 @@ See in [ruleset.yaml](ruleset.yaml) for an example.
 To run a development server at http://localhost:8080:
 ```bash
-echo "DEV" > handler/VERSION
+echo "dev" > handlers/VERSION
 RULESET="./ruleset.yaml" go run cmd/main.go
 ```

View File

@@ -8,6 +8,7 @@ import (
 	"strings"

 	"ladder/handlers"
+	"ladder/internal/cli"

 	"github.com/akamensky/argparse"
 	"github.com/gofiber/fiber/v2"
@@ -17,6 +18,7 @@
 //go:embed favicon.ico
 var faviconData string

 //go:embed styles.css
 var cssData embed.FS
@@ -38,22 +40,59 @@ func main() {
 		Help: "This will spawn multiple processes listening",
 	})

+	verbose := parser.Flag("v", "verbose", &argparse.Options{
+		Required: false,
+		Help:     "Adds verbose logging",
+	})
+
+	// TODO: add version flag that reads from handers/VERSION
+	ruleset := parser.String("r", "ruleset", &argparse.Options{
+		Required: false,
+		Help:     "File, Directory or URL to a ruleset.yaml. Overrides RULESET environment variable.",
+	})
+
+	mergeRulesets := parser.Flag("", "merge-rulesets", &argparse.Options{
+		Required: false,
+		Help:     "Compiles a directory of yaml files into a single ruleset.yaml. Requires --ruleset arg.",
+	})
+
+	mergeRulesetsGzip := parser.Flag("", "merge-rulesets-gzip", &argparse.Options{
+		Required: false,
+		Help:     "Compiles a directory of yaml files into a single ruleset.gz Requires --ruleset arg.",
+	})
+
+	mergeRulesetsOutput := parser.String("", "merge-rulesets-output", &argparse.Options{
+		Required: false,
+		Help:     "Specify output file for --merge-rulesets and --merge-rulesets-gzip. Requires --ruleset and --merge-rulesets args.",
+	})
+
 	err := parser.Parse(os.Args)
 	if err != nil {
 		fmt.Print(parser.Usage(err))
 	}

+	// utility cli flag to compile ruleset directory into single ruleset.yaml
+	if *mergeRulesets || *mergeRulesetsGzip {
+		err = cli.HandleRulesetMerge(ruleset, mergeRulesets, mergeRulesetsGzip, mergeRulesetsOutput)
+		if err != nil {
+			fmt.Println(err)
+			os.Exit(1)
+		}
+		os.Exit(0)
+	}
+
 	if os.Getenv("PREFORK") == "true" {
 		*prefork = true
 	}

 	app := fiber.New(
 		fiber.Config{
 			Prefork: *prefork,
-			GETOnly: true,
+			GETOnly:        false,
+			ReadBufferSize: 4096 * 4, // increase max header size
 		},
 	)

+	// TODO: move to cmd/auth.go
 	userpass := os.Getenv("USERPASS")
 	if userpass != "" {
 		userpass := strings.Split(userpass, ":")
@@ -64,6 +103,7 @@ func main() {
 		}))
 	}

+	// TODO: move to handlers/favicon.go
 	app.Use(favicon.New(favicon.Config{
 		Data: []byte(faviconData),
 		URL:  "/favicon.ico",
@@ -77,6 +117,8 @@ func main() {
 	}

 	app.Get("/", handlers.Form)

+	// TODO: move this logic to handers/styles.go
 	app.Get("/styles.css", func(c *fiber.Ctx) error {
 		cssData, err := cssData.ReadFile("styles.css")
 		if err != nil {
@@ -85,11 +127,18 @@ func main() {
 		c.Set("Content-Type", "text/css")
 		return c.Send(cssData)
 	})

 	app.Get("ruleset", handlers.Ruleset)
 	app.Get("raw/*", handlers.Raw)
 	app.Get("api/*", handlers.Api)
-	app.Get("/*", handlers.ProxySite)
+
+	proxyOpts := &handlers.ProxyOptions{
+		Verbose:     *verbose,
+		RulesetPath: *ruleset,
+	}
+	app.Get("/*", handlers.NewProxySiteHandler(proxyOpts))
+	app.Post("/*", handlers.NewProxySiteHandler(proxyOpts))

 	log.Fatal(app.Listen(":" + *port))
 }

View File

@@ -9,10 +9,11 @@ services:
     environment:
       - PORT=8080
       - RULESET=/app/ruleset.yaml
+      #- ALLOWED_DOMAINS=example.com,example.org
       #- ALLOWED_DOMAINS_RULESET=false
       #- EXPOSE_RULESET=true
       #- PREFORK=false
-      #- DISABLE_FORM=fase
+      #- DISABLE_FORM=false
       #- FORM_PATH=/app/form.html
       #- X_FORWARDED_FOR=66.249.66.1
       #- USER_AGENT=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

go.mod (3 lines changed)
View File

@@ -24,6 +24,7 @@ require (
 	github.com/valyala/bytebufferpool v1.0.0 // indirect
 	github.com/valyala/fasthttp v1.50.0 // indirect
 	github.com/valyala/tcplisten v1.0.0 // indirect
-	golang.org/x/net v0.18.0 // indirect
+	golang.org/x/net v0.18.0
 	golang.org/x/sys v0.14.0 // indirect
+	golang.org/x/term v0.14.0
 )

go.sum (2 lines changed)
View File

@@ -68,6 +68,8 @@ golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9sn
 golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
 golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
 golang.org/x/term v0.7.0/go.mod h1:P32HKFT3hSsZrRxla30E9HqToFYAQPCMs/zFMBUFqPY=
+golang.org/x/term v0.14.0 h1:LGK9IlZ8T9jvdy6cTdfKUCltatMFOehAQo9SRC46UQ8=
+golang.org/x/term v0.14.0/go.mod h1:TySc+nGkYR6qt8km8wUhuFRTVSMIX3XPR58y2lC8vww=
 golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
 golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
 golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=

View File

@@ -28,8 +28,8 @@
 			<footer class="mt-10 mx-4 text-center text-slate-600 dark:text-slate-400">
 				<p>
 					Code Licensed Under GPL v3.0 |
-					<a href="https://github.com/everywall/ladder" class="hover:text-blue-500 hover:underline underline-offset-2 transition-colors duration-300">Source</a> |
-					<a href="https://github.com/everywall/ladder/releases" class="hover:text-blue-500 hover:underline underline-offset-2 transition-colors duration-300">VERSION</a>
+					<a href="https://github.com/everywall/ladder" class="hover:text-blue-500 hover:underline underline-offset-2 transition-colors duration-300">View Source</a> |
+					<a href="https://github.com/everywall" class="hover:text-blue-500 hover:underline underline-offset-2 transition-colors duration-300">Everywall</a>
 				</p>
 			</footer>
 		</div>

View File

@@ -10,93 +10,71 @@ import (
 	"regexp"
 	"strings"

+	"ladder/pkg/ruleset"
+	"ladder/proxychain"
+	rx "ladder/proxychain/requestmodifers"
+	tx "ladder/proxychain/responsemodifers"
+
 	"github.com/PuerkitoBio/goquery"
 	"github.com/gofiber/fiber/v2"
-	"gopkg.in/yaml.v3"
 )

 var (
 	UserAgent    = getenv("USER_AGENT", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
 	ForwardedFor = getenv("X_FORWARDED_FOR", "66.249.66.1")
-	rulesSet       = loadRules()
-	allowedDomains = strings.Split(os.Getenv("ALLOWED_DOMAINS"), ",")
+	rulesSet       = ruleset.NewRulesetFromEnv()
+	allowedDomains = []string{}
 )

-// extracts a URL from the request ctx. If the URL in the request
-// is a relative path, it reconstructs the full URL using the referer header.
-func extractUrl(c *fiber.Ctx) (string, error) {
-	// try to extract url-encoded
-	reqUrl, err := url.QueryUnescape(c.Params("*"))
-	if err != nil {
-		// fallback
-		reqUrl = c.Params("*")
-	}
-
-	// Extract the actual path from req ctx
-	urlQuery, err := url.Parse(reqUrl)
-	if err != nil {
-		return "", fmt.Errorf("error parsing request URL '%s': %v", reqUrl, err)
-	}
-
-	isRelativePath := urlQuery.Scheme == ""
-	// eg: https://localhost:8080/images/foobar.jpg -> https://realsite.com/images/foobar.jpg
-	if isRelativePath {
-		// Parse the referer URL from the request header.
-		refererUrl, err := url.Parse(c.Get("referer"))
-		if err != nil {
-			return "", fmt.Errorf("error parsing referer URL from req: '%s': %v", reqUrl, err)
-		}
-
-		// Extract the real url from referer path
-		realUrl, err := url.Parse(strings.TrimPrefix(refererUrl.Path, "/"))
-		if err != nil {
-			return "", fmt.Errorf("error parsing real URL from referer '%s': %v", refererUrl.Path, err)
-		}
-
-		// reconstruct the full URL using the referer's scheme, host, and the relative path / queries
-		fullUrl := &url.URL{
-			Scheme:   realUrl.Scheme,
-			Host:     realUrl.Host,
-			Path:     urlQuery.Path,
-			RawQuery: urlQuery.RawQuery,
-		}
-
-		if os.Getenv("LOG_URLS") == "true" {
-			log.Printf("modified relative URL: '%s' -> '%s'", reqUrl, fullUrl.String())
-		}
-		return fullUrl.String(), nil
-	}
-
-	// default behavior:
-	// eg: https://localhost:8080/https://realsite.com/images/foobar.jpg -> https://realsite.com/images/foobar.jpg
-	return urlQuery.String(), nil
-}
-
-func ProxySite(c *fiber.Ctx) error {
-	// Get the url from the URL
-	url, err := extractUrl(c)
-	if err != nil {
-		log.Println("ERROR In URL extraction:", err)
-	}
-
-	queries := c.Queries()
-	body, _, resp, err := fetchSite(url, queries)
-	if err != nil {
-		log.Println("ERROR:", err)
-		c.SendStatus(fiber.StatusInternalServerError)
-		return c.SendString(err.Error())
-	}
-
-	c.Set("Content-Type", resp.Header.Get("Content-Type"))
-	c.Set("Content-Security-Policy", resp.Header.Get("Content-Security-Policy"))
-
-	return c.SendString(body)
-}
-
-func modifyURL(uri string, rule Rule) (string, error) {
+func init() {
+	allowedDomains = strings.Split(os.Getenv("ALLOWED_DOMAINS"), ",")
+	if os.Getenv("ALLOWED_DOMAINS_RULESET") == "true" {
+		allowedDomains = append(allowedDomains, rulesSet.Domains()...)
+	}
+}
+
+type ProxyOptions struct {
+	RulesetPath string
+	Verbose     bool
+}
+
+func NewProxySiteHandler(opts *ProxyOptions) fiber.Handler {
+	/*
+		var rs ruleset.RuleSet
+		if opts.RulesetPath != "" {
+			r, err := ruleset.NewRuleset(opts.RulesetPath)
+			if err != nil {
+				panic(err)
+			}
+			rs = r
+		}
+	*/
+	return func(c *fiber.Ctx) error {
+		proxychain := proxychain.
+			NewProxyChain().
+			SetFiberCtx(c).
+			SetDebugLogging(opts.Verbose).
+			SetRequestModifications(
+				rx.DeleteOutgoingCookies(),
+				//rx.RequestArchiveIs(),
+				rx.MasqueradeAsGoogleBot(),
+			).
+			AddResponseModifications(
+				tx.BypassCORS(),
+				tx.BypassContentSecurityPolicy(),
+				tx.DeleteIncomingCookies(),
+				tx.RewriteHTMLResourceURLs(),
+				tx.PatchDynamicResourceURLs(),
+			).
+			Execute()
+		return proxychain
+	}
+}
+
+func modifyURL(uri string, rule ruleset.Rule) (string, error) {
 	newUrl, err := url.Parse(uri)
 	if err != nil {
 		return "", err
@@ -204,7 +182,7 @@ func fetchSite(urlpath string, queries map[string]string) (string, *http.Request
 	}

 	if rule.Headers.CSP != "" {
-		log.Println(rule.Headers.CSP)
+		//log.Println(rule.Headers.CSP)
 		resp.Header.Set("Content-Security-Policy", rule.Headers.CSP)
 	}
@@ -213,7 +191,7 @@
 	return body, req, resp, nil
 }

-func rewriteHtml(bodyB []byte, u *url.URL, rule Rule) string {
+func rewriteHtml(bodyB []byte, u *url.URL, rule ruleset.Rule) string {
 	// Rewrite the HTML
 	body := string(bodyB)
@@ -247,63 +225,11 @@ func getenv(key, fallback string) string {
 	return value
 }

-func loadRules() RuleSet {
-	rulesUrl := os.Getenv("RULESET")
-	if rulesUrl == "" {
-		RulesList := RuleSet{}
-		return RulesList
-	}
-	log.Println("Loading rules")
-	var ruleSet RuleSet
-	if strings.HasPrefix(rulesUrl, "http") {
-		resp, err := http.Get(rulesUrl)
-		if err != nil {
-			log.Println("ERROR:", err)
-		}
-		defer resp.Body.Close()
-		if resp.StatusCode >= 400 {
-			log.Println("ERROR:", resp.StatusCode, rulesUrl)
-		}
-		body, err := io.ReadAll(resp.Body)
-		if err != nil {
-			log.Println("ERROR:", err)
-		}
-		yaml.Unmarshal(body, &ruleSet)
-		if err != nil {
-			log.Println("ERROR:", err)
-		}
-	} else {
-		yamlFile, err := os.ReadFile(rulesUrl)
-		if err != nil {
-			log.Println("ERROR:", err)
-		}
-		yaml.Unmarshal(yamlFile, &ruleSet)
-	}
-
-	domains := []string{}
-	for _, rule := range ruleSet {
-		domains = append(domains, rule.Domain)
-		domains = append(domains, rule.Domains...)
-		if os.Getenv("ALLOWED_DOMAINS_RULESET") == "true" {
-			allowedDomains = append(allowedDomains, domains...)
-		}
-	}
-
-	log.Println("Loaded ", len(ruleSet), " rules for", len(domains), "Domains")
-	return ruleSet
-}
-
-func fetchRule(domain string, path string) Rule {
+func fetchRule(domain string, path string) ruleset.Rule {
 	if len(rulesSet) == 0 {
-		return Rule{}
+		return ruleset.Rule{}
 	}
-	rule := Rule{}
+	rule := ruleset.Rule{}
 	for _, rule := range rulesSet {
 		domains := rule.Domains
 		if rule.Domain != "" {
@@ -322,7 +248,7 @@ func fetchRule(domain string, path string) Rule {
 		return rule
 	}

-func applyRules(body string, rule Rule) string {
+func applyRules(body string, rule ruleset.Rule) string {
 	if len(rulesSet) == 0 {
 		return body
 	}
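
The shape of the new API is easiest to see in isolation. Below is a minimal wiring sketch, not code from this commit: it combines the `ProxyOptions` and `NewProxySiteHandler` names from this hunk with the Fiber setup from `cmd/main.go` above, and the option values are illustrative.

```go
package main

import (
	"log"

	"ladder/handlers"

	"github.com/gofiber/fiber/v2"
)

func main() {
	app := fiber.New(fiber.Config{
		GETOnly:        false,    // POST requests are now proxied too
		ReadBufferSize: 4096 * 4, // allow larger request headers, per issue #39
	})

	// One options struct is shared by the GET and POST routes.
	opts := &handlers.ProxyOptions{
		Verbose:     true,       // wired to -v/--verbose in cmd/main.go
		RulesetPath: "./rules/", // wired to -r/--ruleset; this path is illustrative
	}
	app.Get("/*", handlers.NewProxySiteHandler(opts))
	app.Post("/*", handlers.NewProxySiteHandler(opts))

	log.Fatal(app.Listen(":8080"))
}
```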

View File

@@ -2,6 +2,7 @@
package handlers package handlers
import ( import (
"ladder/pkg/ruleset"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"net/url" "net/url"
@@ -13,7 +14,7 @@ import (
func TestProxySite(t *testing.T) { func TestProxySite(t *testing.T) {
app := fiber.New() app := fiber.New()
app.Get("/:url", ProxySite) app.Get("/:url", NewProxySiteHandler(nil))
req := httptest.NewRequest("GET", "/https://example.com", nil) req := httptest.NewRequest("GET", "/https://example.com", nil)
resp, err := app.Test(req) resp, err := app.Test(req)
@@ -51,7 +52,7 @@ func TestRewriteHtml(t *testing.T) {
</html> </html>
` `
actual := rewriteHtml(bodyB, u, Rule{}) actual := rewriteHtml(bodyB, u, ruleset.Rule{})
assert.Equal(t, expected, actual) assert.Equal(t, expected, actual)
} }

View File

@@ -1,40 +0,0 @@
package handlers
type Regex struct {
Match string `yaml:"match"`
Replace string `yaml:"replace"`
}
type KV struct {
Key string `yaml:"key"`
Value string `yaml:"value"`
}
type RuleSet []Rule
type Rule struct {
Domain string `yaml:"domain,omitempty"`
Domains []string `yaml:"domains,omitempty"`
Paths []string `yaml:"paths,omitempty"`
Headers struct {
UserAgent string `yaml:"user-agent,omitempty"`
XForwardedFor string `yaml:"x-forwarded-for,omitempty"`
Referer string `yaml:"referer,omitempty"`
Cookie string `yaml:"cookie,omitempty"`
CSP string `yaml:"content-security-policy,omitempty"`
} `yaml:"headers,omitempty"`
GoogleCache bool `yaml:"googleCache,omitempty"`
RegexRules []Regex `yaml:"regexRules"`
UrlMods struct {
Domain []Regex `yaml:"domain"`
Path []Regex `yaml:"path"`
Query []KV `yaml:"query"`
} `yaml:"urlMods"`
Injections []struct {
Position string `yaml:"position"`
Append string `yaml:"append"`
Prepend string `yaml:"prepend"`
Replace string `yaml:"replace"`
} `yaml:"injections"`
}

View File

@@ -0,0 +1,105 @@
package cli
import (
"fmt"
"io"
"io/fs"
"ladder/pkg/ruleset"
"os"
"golang.org/x/term"
)
// HandleRulesetMerge merges a set of ruleset files, specified by the rulesetPath or RULESET env variable, into either YAML or Gzip format.
// Exits the program with an error message if the ruleset path is not provided or if loading the ruleset fails.
//
// Parameters:
// - rulesetPath: A pointer to a string specifying the path to the ruleset file.
// - mergeRulesets: A pointer to a boolean indicating if a merge operation should be performed.
// - mergeRulesetsGzip: A pointer to a boolean indicating if the merge should be in Gzip format.
// - mergeRulesetsOutput: A pointer to a string specifying the output file path. If empty, the output is printed to stdout.
//
// Returns:
// - An error if the ruleset loading or merging process fails, otherwise nil.
func HandleRulesetMerge(rulesetPath *string, mergeRulesets *bool, mergeRulesetsGzip *bool, mergeRulesetsOutput *string) error {
if *rulesetPath == "" {
*rulesetPath = os.Getenv("RULESET")
}
if *rulesetPath == "" {
fmt.Println("ERROR: no ruleset provided. Try again with --ruleset <ruleset.yaml>")
os.Exit(1)
}
rs, err := ruleset.NewRuleset(*rulesetPath)
if err != nil {
fmt.Println(err)
os.Exit(1)
}
if *mergeRulesetsGzip {
return gzipMerge(rs, mergeRulesetsOutput)
}
return yamlMerge(rs, mergeRulesetsOutput)
}
// gzipMerge takes a RuleSet and an optional output file path pointer. It compresses the RuleSet into Gzip format.
// If the output file path is provided, the compressed data is written to this file. Otherwise, it prints a warning
// and outputs the binary data to stdout
//
// Parameters:
// - rs: The ruleset.RuleSet to be compressed.
// - mergeRulesetsOutput: A pointer to a string specifying the output file path. If empty, the output is directed to stdout.
//
// Returns:
// - An error if compression or file writing fails, otherwise nil.
func gzipMerge(rs ruleset.RuleSet, mergeRulesetsOutput *string) error {
gzip, err := rs.GzipYaml()
if err != nil {
return err
}
if *mergeRulesetsOutput != "" {
out, err := os.Create(*mergeRulesetsOutput)
defer out.Close()
_, err = io.Copy(out, gzip)
if err != nil {
return err
}
}
if term.IsTerminal(int(os.Stdout.Fd())) {
println("WARNING: binary output can mess up your terminal. Use '--merge-rulesets-output <ruleset.gz>' or pipe it to a file.")
os.Exit(1)
}
_, err = io.Copy(os.Stdout, gzip)
if err != nil {
return err
}
return nil
}
// yamlMerge takes a RuleSet and an optional output file path pointer. It converts the RuleSet into YAML format.
// If the output file path is provided, the YAML data is written to this file. If not, the YAML data is printed to stdout.
//
// Parameters:
// - rs: The ruleset.RuleSet to be converted to YAML.
// - mergeRulesetsOutput: A pointer to a string specifying the output file path. If empty, the output is printed to stdout.
//
// Returns:
// - An error if YAML conversion or file writing fails, otherwise nil.
func yamlMerge(rs ruleset.RuleSet, mergeRulesetsOutput *string) error {
yaml, err := rs.Yaml()
if err != nil {
return err
}
if *mergeRulesetsOutput == "" {
fmt.Printf(yaml)
os.Exit(0)
}
err = os.WriteFile(*mergeRulesetsOutput, []byte(yaml), fs.FileMode(os.O_RDWR))
if err != nil {
return fmt.Errorf("ERROR: failed to write merged YAML ruleset to '%s'\n", *mergeRulesetsOutput)
}
return nil
}
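
For context, here is a sketch of driving this entry point directly. The real CLI passes pointers produced by argparse (see `cmd/main.go` above); the literal values below are illustrative only.

```go
package main

import (
	"fmt"
	"os"

	"ladder/internal/cli"
)

func main() {
	// The real CLI obtains these from argparse flags; hard-coded here for illustration.
	rulesetPath := "./rulesets/" // directory of *.yaml rule files
	merge := true                // --merge-rulesets
	mergeGzip := false           // --merge-rulesets-gzip
	output := "ruleset.yaml"     // --merge-rulesets-output

	if err := cli.HandleRulesetMerge(&rulesetPath, &merge, &mergeGzip, &output); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
}
```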

pkg/ruleset/ruleset.go (new file, 286 lines)
View File

@@ -0,0 +1,286 @@
package ruleset
import (
"errors"
"fmt"
"io"
"log"
"net/http"
"os"
"path/filepath"
"regexp"
"strings"
"compress/gzip"
"gopkg.in/yaml.v3"
)
type Regex struct {
Match string `yaml:"match"`
Replace string `yaml:"replace"`
}
type KV struct {
Key string `yaml:"key"`
Value string `yaml:"value"`
}
type RuleSet []Rule
type Rule struct {
Domain string `yaml:"domain,omitempty"`
Domains []string `yaml:"domains,omitempty"`
Paths []string `yaml:"paths,omitempty"`
Headers struct {
UserAgent string `yaml:"user-agent,omitempty"`
XForwardedFor string `yaml:"x-forwarded-for,omitempty"`
Referer string `yaml:"referer,omitempty"`
Cookie string `yaml:"cookie,omitempty"`
CSP string `yaml:"content-security-policy,omitempty"`
} `yaml:"headers,omitempty"`
GoogleCache bool `yaml:"googleCache,omitempty"`
RegexRules []Regex `yaml:"regexRules,omitempty"`
UrlMods struct {
Domain []Regex `yaml:"domain,omitempty"`
Path []Regex `yaml:"path,omitempty"`
Query []KV `yaml:"query,omitempty"`
} `yaml:"urlMods,omitempty"`
Injections []struct {
Position string `yaml:"position,omitempty"`
Append string `yaml:"append,omitempty"`
Prepend string `yaml:"prepend,omitempty"`
Replace string `yaml:"replace,omitempty"`
} `yaml:"injections,omitempty"`
}
// NewRulesetFromEnv creates a new RuleSet based on the RULESET environment variable.
// It logs a warning and returns an empty RuleSet if the RULESET environment variable is not set.
// If the RULESET is set but the rules cannot be loaded, it panics.
func NewRulesetFromEnv() RuleSet {
rulesPath, ok := os.LookupEnv("RULESET")
if !ok {
log.Printf("WARN: No ruleset specified. Set the `RULESET` environment variable to load one for a better success rate.")
return RuleSet{}
}
ruleSet, err := NewRuleset(rulesPath)
if err != nil {
log.Println(err)
}
return ruleSet
}
// NewRuleset loads a RuleSet from a given string of rule paths, separated by semicolons.
// It supports loading rules from both local file paths and remote URLs.
// Returns a RuleSet and an error if any issues occur during loading.
func NewRuleset(rulePaths string) (RuleSet, error) {
ruleSet := RuleSet{}
errs := []error{}
rp := strings.Split(rulePaths, ";")
var remoteRegex = regexp.MustCompile(`^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()!@:%_\+.~#?&\/\/=]*)`)
for _, rule := range rp {
rulePath := strings.Trim(rule, " ")
var err error
isRemote := remoteRegex.MatchString(rulePath)
if isRemote {
err = ruleSet.loadRulesFromRemoteFile(rulePath)
} else {
err = ruleSet.loadRulesFromLocalDir(rulePath)
}
if err != nil {
e := fmt.Errorf("WARN: failed to load ruleset from '%s'", rulePath)
errs = append(errs, errors.Join(e, err))
continue
}
}
if len(errs) != 0 {
e := fmt.Errorf("WARN: failed to load %d rulesets", len(rp))
errs = append(errs, e)
// panic if the user specified a local ruleset, but it wasn't found on disk
// don't fail silently
for _, err := range errs {
if errors.Is(os.ErrNotExist, err) {
e := fmt.Errorf("PANIC: ruleset '%s' not found", err)
panic(errors.Join(e, err))
}
}
// else, bubble up any errors, such as syntax or remote host issues
return ruleSet, errors.Join(errs...)
}
ruleSet.PrintStats()
return ruleSet, nil
}
// ================== RULESET loading logic ===================================
// loadRulesFromLocalDir loads rules from a local directory specified by the path.
// It walks through the directory, loading rules from YAML files.
// Returns an error if the directory cannot be accessed
// If there is an issue loading any file, it will be skipped
func (rs *RuleSet) loadRulesFromLocalDir(path string) error {
_, err := os.Stat(path)
if err != nil {
return err
}
yamlRegex := regexp.MustCompile(`.*\.ya?ml`)
err = filepath.Walk(path, func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
if info.IsDir() {
return nil
}
if isYaml := yamlRegex.MatchString(path); !isYaml {
return nil
}
err = rs.loadRulesFromLocalFile(path)
if err != nil {
log.Printf("WARN: failed to load directory ruleset '%s': %s, skipping", path, err)
return nil
}
log.Printf("INFO: loaded ruleset %s\n", path)
return nil
})
if err != nil {
return err
}
return nil
}
// loadRulesFromLocalFile loads rules from a local YAML file specified by the path.
// Returns an error if the file cannot be read or if there's a syntax error in the YAML.
func (rs *RuleSet) loadRulesFromLocalFile(path string) error {
yamlFile, err := os.ReadFile(path)
if err != nil {
e := fmt.Errorf("failed to read rules from local file: '%s'", path)
return errors.Join(e, err)
}
var r RuleSet
err = yaml.Unmarshal(yamlFile, &r)
if err != nil {
e := fmt.Errorf("failed to load rules from local file, possible syntax error in '%s'", path)
ee := errors.Join(e, err)
if _, ok := os.LookupEnv("DEBUG"); ok {
debugPrintRule(string(yamlFile), ee)
}
return ee
}
*rs = append(*rs, r...)
return nil
}
// loadRulesFromRemoteFile loads rules from a remote URL.
// It supports plain and gzip compressed content.
// Returns an error if there's an issue accessing the URL or if there's a syntax error in the YAML.
func (rs *RuleSet) loadRulesFromRemoteFile(rulesUrl string) error {
var r RuleSet
resp, err := http.Get(rulesUrl)
if err != nil {
e := fmt.Errorf("failed to load rules from remote url '%s'", rulesUrl)
return errors.Join(e, err)
}
defer resp.Body.Close()
if resp.StatusCode >= 400 {
e := fmt.Errorf("failed to load rules from remote url (%s) on '%s'", resp.Status, rulesUrl)
return errors.Join(e, err)
}
var reader io.Reader
isGzip := strings.HasSuffix(rulesUrl, ".gz") || strings.HasSuffix(rulesUrl, ".gzip") || resp.Header.Get("content-encoding") == "gzip"
if isGzip {
reader, err = gzip.NewReader(resp.Body)
if err != nil {
return fmt.Errorf("failed to create gzip reader for URL '%s' with status code '%s': %w", rulesUrl, resp.Status, err)
}
} else {
reader = resp.Body
}
err = yaml.NewDecoder(reader).Decode(&r)
if err != nil {
e := fmt.Errorf("failed to load rules from remote url '%s' with status code '%s' and possible syntax error", rulesUrl, resp.Status)
ee := errors.Join(e, err)
return ee
}
*rs = append(*rs, r...)
return nil
}
// ================= utility methods ==========================
// Yaml returns the ruleset as a Yaml string
func (rs *RuleSet) Yaml() (string, error) {
y, err := yaml.Marshal(rs)
if err != nil {
return "", err
}
return string(y), nil
}
// GzipYaml returns an io.Reader that streams the Gzip-compressed YAML representation of the RuleSet.
func (rs *RuleSet) GzipYaml() (io.Reader, error) {
pr, pw := io.Pipe()
go func() {
defer pw.Close()
gw := gzip.NewWriter(pw)
defer gw.Close()
if err := yaml.NewEncoder(gw).Encode(rs); err != nil {
gw.Close() // Ensure to close the gzip writer
pw.CloseWithError(err)
return
}
}()
return pr, nil
}
// Domains extracts and returns a slice of all domains present in the RuleSet.
func (rs *RuleSet) Domains() []string {
var domains []string
for _, rule := range *rs {
domains = append(domains, rule.Domain)
domains = append(domains, rule.Domains...)
}
return domains
}
// DomainCount returns the count of unique domains present in the RuleSet.
func (rs *RuleSet) DomainCount() int {
return len(rs.Domains())
}
// Count returns the total number of rules in the RuleSet.
func (rs *RuleSet) Count() int {
return len(*rs)
}
// PrintStats logs the number of rules and domains loaded in the RuleSet.
func (rs *RuleSet) PrintStats() {
log.Printf("INFO: Loaded %d rules for %d domains\n", rs.Count(), rs.DomainCount())
}
// debugPrintRule is a utility function for printing a rule and associated error for debugging purposes.
func debugPrintRule(rule string, err error) {
fmt.Println("------------------------------ BEGIN DEBUG RULESET -----------------------------")
fmt.Printf("%s\n", err.Error())
fmt.Println("--------------------------------------------------------------------------------")
fmt.Println(rule)
fmt.Println("------------------------------ END DEBUG RULESET -------------------------------")
}

pkg/ruleset/ruleset_test.go (new file, 153 lines)
View File

@@ -0,0 +1,153 @@
package ruleset
import (
"os"
"path/filepath"
"testing"
"time"
"github.com/gofiber/fiber/v2"
"github.com/stretchr/testify/assert"
)
var (
validYAML = `
- domain: example.com
regexRules:
- match: "^http:"
replace: "https:"`
invalidYAML = `
- domain: [thisIsATestYamlThatIsMeantToFail.example]
regexRules:
- match: "^http:"
replace: "https:"
- match: "[incomplete"`
)
func TestLoadRulesFromRemoteFile(t *testing.T) {
app := fiber.New()
defer app.Shutdown()
app.Get("/valid-config.yml", func(c *fiber.Ctx) error {
c.SendString(validYAML)
return nil
})
app.Get("/invalid-config.yml", func(c *fiber.Ctx) error {
c.SendString(invalidYAML)
return nil
})
app.Get("/valid-config.gz", func(c *fiber.Ctx) error {
c.Set("Content-Type", "application/octet-stream")
rs, err := loadRuleFromString(validYAML)
if err != nil {
t.Errorf("failed to load valid yaml from string: %s", err.Error())
}
s, err := rs.GzipYaml()
if err != nil {
t.Errorf("failed to load gzip serialize yaml: %s", err.Error())
}
err = c.SendStream(s)
if err != nil {
t.Errorf("failed to stream gzip serialized yaml: %s", err.Error())
}
return nil
})
// Start the server in a goroutine
go func() {
if err := app.Listen("127.0.0.1:9999"); err != nil {
t.Errorf("Server failed to start: %s", err.Error())
}
}()
// Wait for the server to start
time.Sleep(time.Second * 1)
rs, err := NewRuleset("http://127.0.0.1:9999/valid-config.yml")
if err != nil {
t.Errorf("failed to load plaintext ruleset from http server: %s", err.Error())
}
assert.Equal(t, rs[0].Domain, "example.com")
rs, err = NewRuleset("http://127.0.0.1:9999/valid-config.gz")
if err != nil {
t.Errorf("failed to load gzipped ruleset from http server: %s", err.Error())
}
assert.Equal(t, rs[0].Domain, "example.com")
os.Setenv("RULESET", "http://127.0.0.1:9999/valid-config.gz")
rs = NewRulesetFromEnv()
if !assert.Equal(t, rs[0].Domain, "example.com") {
t.Error("expected no errors loading ruleset from gzip url using environment variable, but got one")
}
}
func loadRuleFromString(yaml string) (RuleSet, error) {
// Create a temporary file and load it
tmpFile, _ := os.CreateTemp("", "ruleset*.yaml")
defer os.Remove(tmpFile.Name())
tmpFile.WriteString(yaml)
rs := RuleSet{}
err := rs.loadRulesFromLocalFile(tmpFile.Name())
return rs, err
}
// TestLoadRulesFromLocalFile tests the loading of rules from a local YAML file.
func TestLoadRulesFromLocalFile(t *testing.T) {
rs, err := loadRuleFromString(validYAML)
if err != nil {
t.Errorf("Failed to load rules from valid YAML: %s", err)
}
assert.Equal(t, rs[0].Domain, "example.com")
assert.Equal(t, rs[0].RegexRules[0].Match, "^http:")
assert.Equal(t, rs[0].RegexRules[0].Replace, "https:")
_, err = loadRuleFromString(invalidYAML)
if err == nil {
t.Errorf("Expected an error when loading invalid YAML, but got none")
}
}
// TestLoadRulesFromLocalDir tests the loading of rules from a local nested directory full of yaml rulesets
func TestLoadRulesFromLocalDir(t *testing.T) {
// Create a temporary directory
baseDir, err := os.MkdirTemp("", "ruleset_test")
if err != nil {
t.Fatalf("Failed to create temporary directory: %s", err)
}
defer os.RemoveAll(baseDir)
// Create a nested subdirectory
nestedDir := filepath.Join(baseDir, "nested")
err = os.Mkdir(nestedDir, 0755)
if err != nil {
t.Fatalf("Failed to create nested directory: %s", err)
}
// Create a nested subdirectory
nestedTwiceDir := filepath.Join(nestedDir, "nestedTwice")
err = os.Mkdir(nestedTwiceDir, 0755)
testCases := []string{"test.yaml", "test2.yaml", "test-3.yaml", "test 4.yaml", "1987.test.yaml.yml", "foobar.example.com.yaml", "foobar.com.yml"}
for _, fileName := range testCases {
filePath := filepath.Join(nestedDir, "2x-"+fileName)
os.WriteFile(filePath, []byte(validYAML), 0644)
filePath = filepath.Join(nestedDir, fileName)
os.WriteFile(filePath, []byte(validYAML), 0644)
filePath = filepath.Join(baseDir, "base-"+fileName)
os.WriteFile(filePath, []byte(validYAML), 0644)
}
rs := RuleSet{}
err = rs.loadRulesFromLocalDir(baseDir)
assert.NoError(t, err)
assert.Equal(t, rs.Count(), len(testCases)*3)
for _, rule := range rs {
assert.Equal(t, rule.Domain, "example.com")
assert.Equal(t, rule.RegexRules[0].Match, "^http:")
assert.Equal(t, rule.RegexRules[0].Replace, "https:")
}
}

proxychain/proxychain.go (new file, 427 lines)
View File

@@ -0,0 +1,427 @@
package proxychain
import (
"errors"
"fmt"
"io"
"log"
"net/http"
"net/url"
"strings"
"ladder/pkg/ruleset"
rr "ladder/proxychain/responsemodifers/rewriters"
"github.com/gofiber/fiber/v2"
)
/*
ProxyChain manages the process of forwarding an HTTP request to an upstream server,
applying request and response modifications along the way.
- It accepts incoming HTTP requests (as a Fiber *ctx), and applies
request modifiers (ReqMods) and response modifiers (ResMods) before passing the
upstream response back to the client.
- ProxyChains can be reused to avoid memory allocations. However, they are not concurrent-safe
so a ProxyChainPool should be used with mutexes to avoid memory errors.
---
# EXAMPLE
```
import (
rx "ladder/pkg/proxychain/requestmodifers"
tx "ladder/pkg/proxychain/responsemodifers"
"ladder/pkg/proxychain/responsemodifers/rewriters"
"ladder/internal/proxychain"
)
proxychain.NewProxyChain().
SetFiberCtx(c).
SetRequestModifications(
rx.BlockOutgoingCookies(),
rx.SpoofOrigin(),
rx.SpoofReferrer(),
).
SetResultModifications(
tx.BlockIncomingCookies(),
tx.RewriteHTMLResourceURLs()
).
Execute()
```
client ladder service upstream
┌─────────┐ ┌────────────────────────┐ ┌─────────┐
│ │GET │ │ │ │
│ req────┼───► ProxyChain │ │ │
│ │ │ │ │ │ │
│ │ │ ▼ │ │ │
│ │ │ apply │ │ │
│ │ │ RequestModifications │ │ │
│ │ │ │ │ │ │
│ │ │ ▼ │ │ │
│ │ │ send GET │ │ │
│ │ │ Request req────────┼─► │ │
│ │ │ │ │ │
│ │ │ 200 OK │ │ │
│ │ │ ┌────────────────┼─response │
│ │ │ ▼ │ │ │
│ │ │ apply │ │ │
│ │ │ ResultModifications │ │ │
│ │ │ │ │ │ │
│ │◄───┼───────┘ │ │ │
│ │ │ 200 OK │ │ │
│ │ │ │ │ │
└─────────┘ └────────────────────────┘ └─────────┘
*/
type ProxyChain struct {
Context *fiber.Ctx
Client *http.Client
Request *http.Request
Response *http.Response
requestModifications []RequestModification
resultModifications []ResponseModification
htmlTokenRewriters []rr.IHTMLTokenRewriter
Ruleset *ruleset.RuleSet
debugMode bool
abortErr error
}
// a ProxyStrategy is a pre-built proxychain with purpose-built defaults
type ProxyStrategy ProxyChain
// A RequestModification is a function that should operate on the
// ProxyChain Req or Client field, using the fiber ctx as needed.
type RequestModification func(*ProxyChain) error
// A ResponseModification is a function that should operate on the
// ProxyChain Res (http result) & Body (buffered http response body) field
type ResponseModification func(*ProxyChain) error
// SetRequestModifications sets the ProxyChain's request modifers
// the modifier will not fire until ProxyChain.Execute() is run.
func (chain *ProxyChain) SetRequestModifications(mods ...RequestModification) *ProxyChain {
chain.requestModifications = mods
return chain
}
// AddRequestModifications sets the ProxyChain's request modifers
// the modifier will not fire until ProxyChain.Execute() is run.
func (chain *ProxyChain) AddRequestModifications(mods ...RequestModification) *ProxyChain {
chain.requestModifications = append(chain.requestModifications, mods...)
return chain
}
// AddResponseModifications sets the ProxyChain's response modifers
// the modifier will not fire until ProxyChain.Execute() is run.
func (chain *ProxyChain) AddResponseModifications(mods ...ResponseModification) *ProxyChain {
chain.resultModifications = mods
return chain
}
// Adds a ruleset to ProxyChain
func (chain *ProxyChain) AddRuleset(rs *ruleset.RuleSet) *ProxyChain {
chain.Ruleset = rs
// TODO: add _applyRuleset method
return chain
}
func (chain *ProxyChain) _initialize_request() (*http.Request, error) {
if chain.Context == nil {
chain.abortErr = chain.abort(errors.New("no context set"))
return nil, chain.abortErr
}
// initialize a request (without url)
req, err := http.NewRequest(chain.Context.Method(), "", nil)
if err != nil {
return nil, err
}
chain.Request = req
switch chain.Context.Method() {
case "GET":
case "DELETE":
case "HEAD":
case "OPTIONS":
break
case "POST":
case "PUT":
case "PATCH":
// stream content of body from client request to upstream request
chain.Request.Body = io.NopCloser(chain.Context.Request().BodyStream())
default:
return nil, fmt.Errorf("unsupported request method from client: '%s'", chain.Context.Method())
}
/*
// copy client request headers to upstream request headers
forwardHeaders := func(key []byte, val []byte) {
req.Header.Set(string(key), string(val))
}
clientHeaders := &chain.Context.Request().Header
clientHeaders.VisitAll(forwardHeaders)
*/
return req, nil
}
// reconstructUrlFromReferer reconstructs the URL using the referer's scheme, host, and the relative path / queries
func reconstructUrlFromReferer(referer *url.URL, relativeUrl *url.URL) (*url.URL, error) {
// Extract the real url from referer path
realUrl, err := url.Parse(strings.TrimPrefix(referer.Path, "/"))
if err != nil {
return nil, fmt.Errorf("error parsing real URL from referer '%s': %v", referer.Path, err)
}
if realUrl.Scheme == "" || realUrl.Host == "" {
return nil, fmt.Errorf("invalid referer URL: '%s' on request '%s", referer.String(), relativeUrl.String())
}
log.Printf("rewrite relative URL using referer: '%s' -> '%s'\n", relativeUrl.String(), realUrl.String())
return &url.URL{
Scheme: referer.Scheme,
Host: referer.Host,
Path: realUrl.Path,
RawQuery: realUrl.RawQuery,
}, nil
}
// prevents calls like: http://localhost:8080/http://localhost:8080
func preventRecursiveProxyRequest(urlQuery *url.URL, baseProxyURL string) *url.URL {
u := urlQuery.String()
isRecursive := strings.HasPrefix(u, baseProxyURL) || u == baseProxyURL
if !isRecursive {
return urlQuery
}
fixedURL, err := url.Parse(strings.TrimPrefix(strings.TrimPrefix(urlQuery.String(), baseProxyURL), "/"))
if err != nil {
log.Printf("proxychain: failed to fix recursive request: '%s' -> '%s\n'", baseProxyURL, u)
return urlQuery
}
return preventRecursiveProxyRequest(fixedURL, baseProxyURL)
}
// extractUrl extracts a URL from the request ctx. If the URL in the request
// is a relative path, it reconstructs the full URL using the referer header.
func (chain *ProxyChain) extractUrl() (*url.URL, error) {
reqUrl := chain.Context.Params("*")
// sometimes client requests doubleroot '//'
// there is a bug somewhere else, but this is a workaround until we find it
if strings.HasPrefix(reqUrl, "/") || strings.HasPrefix(reqUrl, `%2F`) {
reqUrl = strings.TrimPrefix(reqUrl, "/")
reqUrl = strings.TrimPrefix(reqUrl, `%2F`)
}
// unescape url query
uReqUrl, err := url.QueryUnescape(reqUrl)
if err == nil {
reqUrl = uReqUrl
}
urlQuery, err := url.Parse(reqUrl)
if err != nil {
return nil, fmt.Errorf("error parsing request URL '%s': %v", reqUrl, err)
}
// prevent recursive proxy requests
fullURL := chain.Context.Request().URI()
proxyURL := fmt.Sprintf("%s://%s", fullURL.Scheme(), fullURL.Host())
urlQuery = preventRecursiveProxyRequest(urlQuery, proxyURL)
// Handle standard paths
// eg: https://localhost:8080/https://realsite.com/images/foobar.jpg -> https://realsite.com/images/foobar.jpg
isRelativePath := urlQuery.Scheme == ""
if !isRelativePath {
return urlQuery, nil
}
// Handle relative URLs
// eg: https://localhost:8080/images/foobar.jpg -> https://realsite.com/images/foobar.jpg
referer, err := url.Parse(chain.Context.Get("referer"))
relativePath := urlQuery
if err != nil {
return nil, fmt.Errorf("error parsing referer URL from req: '%s': %v", relativePath, err)
}
return reconstructUrlFromReferer(referer, relativePath)
}
// AddBodyRewriter adds a HTMLTokenRewriter to the chain.
// - HTMLTokenRewriters modify the body response by parsing the HTML
// and making changes to the DOM as it streams to the client
// - In most cases, you don't need to use this method. It's usually called by
// a ResponseModifier to batch queue changes for performance reasons.
func (chain *ProxyChain) AddHTMLTokenRewriter(rr rr.IHTMLTokenRewriter) *ProxyChain {
chain.htmlTokenRewriters = append(chain.htmlTokenRewriters, rr)
return chain
}
// SetFiberCtx takes the request ctx from the client
// for the modifiers and execute function to use.
// it must be set everytime a new request comes through
// if the upstream request url cannot be extracted from the ctx,
// a 500 error will be sent back to the client
func (chain *ProxyChain) SetFiberCtx(ctx *fiber.Ctx) *ProxyChain {
chain.Context = ctx
// initialize the request and prepare it for modification
req, err := chain._initialize_request()
if err != nil {
chain.abortErr = chain.abort(err)
}
chain.Request = req
// extract the URL for the request and add it to the new request
url, err := chain.extractUrl()
if err != nil {
chain.abortErr = chain.abort(err)
}
chain.Request.URL = url
fmt.Printf("extracted URL: %s\n", chain.Request.URL)
return chain
}
func (chain *ProxyChain) validateCtxIsSet() error {
if chain.Context != nil {
return nil
}
err := errors.New("proxyChain was called without setting a fiber Ctx. Use ProxyChain.SetCtx()")
chain.abortErr = chain.abort(err)
return chain.abortErr
}
// SetHttpClient sets a new upstream http client transport
// useful for modifying TLS
func (chain *ProxyChain) SetHttpClient(httpClient *http.Client) *ProxyChain {
chain.Client = httpClient
return chain
}
// SetVerbose changes the logging behavior to print
// the modification steps and applied rulesets for debugging
func (chain *ProxyChain) SetDebugLogging(isDebugMode bool) *ProxyChain {
chain.debugMode = isDebugMode
return chain
}
// abort proxychain and return 500 error to client
// this will prevent Execute from firing and reset the state
// returns the initial error enriched with context
func (chain *ProxyChain) abort(err error) error {
//defer chain._reset()
chain.abortErr = err
chain.Context.Response().SetStatusCode(500)
e := fmt.Errorf("ProxyChain error for '%s': %s", chain.Request.URL.String(), err.Error())
chain.Context.SendString(e.Error())
log.Println(e.Error())
return e
}
// internal function to reset state of ProxyChain for reuse
func (chain *ProxyChain) _reset() {
chain.abortErr = nil
chain.Request = nil
//chain.Response = nil
chain.Context = nil
}
// NewProxyChain initializes a new ProxyChain
func NewProxyChain() *ProxyChain {
chain := new(ProxyChain)
chain.Client = http.DefaultClient
return chain
}
/// ========================================================================================================
// _execute sends the request for the ProxyChain and returns the raw body only
// the caller is responsible for returning a response back to the requestor
// the caller is also responsible for calling chain._reset() when they are done with the body
func (chain *ProxyChain) _execute() (io.Reader, error) {
if chain.validateCtxIsSet() != nil || chain.abortErr != nil {
return nil, chain.abortErr
}
if chain.Request == nil {
return nil, errors.New("proxychain request not yet initialized")
}
if chain.Request.URL.Scheme == "" {
return nil, errors.New("request url not set or invalid. Check ProxyChain ReqMods for issues")
}
// Apply requestModifications to proxychain
for _, applyRequestModificationsTo := range chain.requestModifications {
err := applyRequestModificationsTo(chain)
if err != nil {
return nil, chain.abort(err)
}
}
// Send Request Upstream
resp, err := chain.Client.Do(chain.Request)
if err != nil {
return nil, chain.abort(err)
}
chain.Response = resp
/* todo: move to rsm
for k, v := range resp.Header {
chain.Context.Set(k, resp.Header.Get(k))
}
*/
// Apply ResponseModifiers to proxychain
for _, applyResultModificationsTo := range chain.resultModifications {
err := applyResultModificationsTo(chain)
if err != nil {
return nil, chain.abort(err)
}
}
// stream request back to client, possibly rewriting the body
if len(chain.htmlTokenRewriters) == 0 {
return chain.Response.Body, nil
}
ct := chain.Response.Header.Get("content-type")
switch {
case strings.HasPrefix(ct, "text/html"):
fmt.Println("fooox")
return rr.NewHTMLRewriter(chain.Response.Body, chain.htmlTokenRewriters), nil
default:
return chain.Response.Body, nil
}
}
// Execute sends the request for the ProxyChain and returns the request to the sender
// and resets the fields so that the ProxyChain can be reused.
// if any step in the ProxyChain fails, the request will abort and a 500 error will
// be returned to the client
func (chain *ProxyChain) Execute() error {
defer chain._reset()
body, err := chain._execute()
if err != nil {
log.Println(err)
return err
}
if chain.Context == nil {
return errors.New("no context set")
}
// Return request back to client
chain.Context.Set("content-type", chain.Response.Header.Get("content-type"))
return chain.Context.SendStream(body)
//return chain.Context.SendStream(body)
}
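
Because a `RequestModification` is just a `func(*ProxyChain) error` that mutates `chain.Request` before `Execute()` fires, new modifiers compose naturally. Here is a hypothetical modifier in the same style as the ones in `requestmodifers` below; it is not part of this changeset.

```go
package requestmodifers

import (
	"ladder/proxychain"
)

// SpoofAcceptLanguage is a hypothetical modifier written in the style of
// this package: it returns a closure that edits the outgoing request
// headers before ProxyChain.Execute() sends the request upstream.
func SpoofAcceptLanguage(lang string) proxychain.RequestModification {
	return func(px *proxychain.ProxyChain) error {
		px.Request.Header.Set("Accept-Language", lang)
		return nil
	}
}
```

It would slot into `SetRequestModifications(...)` alongside `rx.MasqueradeAsGoogleBot()` in the handler shown earlier.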

View File

@@ -0,0 +1,11 @@
package proxychain
import (
"net/url"
)
type ProxyChainPool map[url.URL]ProxyChain
func NewProxyChainPool() ProxyChainPool {
return map[url.URL]ProxyChain{}
}

View File

@@ -0,0 +1,33 @@
package requestmodifers
import (
"ladder/proxychain"
)
// MasqueradeAsGoogleBot modifies user agent and x-forwarded for
// to appear to be a Google Bot
func MasqueradeAsGoogleBot() proxychain.RequestModification {
const botUA string = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; http://www.google.com/bot.html) Chrome/79.0.3945.120 Safari/537.36"
const botIP string = "66.249.78.8" // TODO: create a random ip pool from https://developers.google.com/static/search/apis/ipranges/googlebot.json
return masqueradeAsTrustedBot(botUA, botIP)
}
// MasqueradeAsBingBot modifies user agent and x-forwarded for
// to appear to be a Bing Bot
func MasqueradeAsBingBot() proxychain.RequestModification {
const botUA string = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/79.0.3945.120 Safari/537.36"
const botIP string = "13.66.144.9" // https://www.bing.com/toolbox/bingbot.json
return masqueradeAsTrustedBot(botUA, botIP)
}
func masqueradeAsTrustedBot(botUA string, botIP string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.AddRequestModifications(
SpoofUserAgent(botUA),
SpoofXForwardedFor(botIP),
SpoofReferrer(""),
SpoofOrigin(""),
)
return nil
}
}

View File

@@ -0,0 +1,13 @@
package requestmodifers
import (
"ladder/proxychain"
"regexp"
)
func ModifyDomainWithRegex(match regexp.Regexp, replacement string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.URL.Host = match.ReplaceAllString(px.Request.URL.Host, replacement)
return nil
}
}

View File

@@ -0,0 +1,97 @@
package requestmodifers
import (
"ladder/proxychain"
"net/http"
)
// SetOutgoingCookie modifes a specific cookie name
// by modifying the request cookie headers going to the upstream server.
// If the cookie name does not already exist, it is created.
func SetOutgoingCookie(name string, val string) proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
cookies := chain.Request.Cookies()
hasCookie := false
for _, cookie := range cookies {
if cookie.Name != name {
continue
}
hasCookie = true
cookie.Value = val
}
if hasCookie {
return nil
}
chain.Request.AddCookie(&http.Cookie{
Domain: chain.Request.URL.Host,
Name: name,
Value: val,
})
return nil
}
}
// SetOutgoingCookies modifies a client request's cookie header
// to a raw Cookie string, overwriting existing cookies
func SetOutgoingCookies(cookies string) proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.Request.Header.Set("Cookies", cookies)
return nil
}
}
// DeleteOutgoingCookie modifies the http request's cookies header to
// delete a specific request cookie going to the upstream server.
// If the cookie does not exist, it does not do anything.
func DeleteOutgoingCookie(name string) proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
cookies := chain.Request.Cookies()
chain.Request.Header.Del("Cookies")
for _, cookie := range cookies {
if cookie.Name == name {
chain.Request.AddCookie(cookie)
}
}
return nil
}
}
// DeleteOutgoingCookies removes the cookie header entirely,
// preventing any cookies from reaching the upstream server.
func DeleteOutgoingCookies() proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.Header.Del("Cookie")
return nil
}
}
// DeleteOutGoingCookiesExcept prevents non-whitelisted cookies from being sent from the client
// to the upstream proxy server. Cookies whose names are in the whitelist are not removed.
func DeleteOutgoingCookiesExcept(whitelist ...string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
// Convert whitelist slice to a map for efficient lookups
whitelistMap := make(map[string]struct{})
for _, cookieName := range whitelist {
whitelistMap[cookieName] = struct{}{}
}
// Get all cookies from the request header
cookies := px.Request.Cookies()
// Clear the original Cookie header
px.Request.Header.Del("Cookie")
// Re-add cookies that are in the whitelist
for _, cookie := range cookies {
if _, found := whitelistMap[cookie.Name]; found {
px.Request.AddCookie(cookie)
}
}
return nil
}
}
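
A sketch of composing these cookie modifiers on a chain follows; the combination is hypothetical and the whitelisted cookie name is made up.

```go
package main

import (
	"ladder/proxychain"
	rx "ladder/proxychain/requestmodifers"

	"github.com/gofiber/fiber/v2"
)

// handler shows a hypothetical composition of the cookie modifiers above.
func handler(c *fiber.Ctx) error {
	return proxychain.NewProxyChain().
		SetFiberCtx(c).
		SetRequestModifications(
			// Keep only a consent cookie; everything else is stripped
			// before the request reaches the upstream server.
			rx.DeleteOutgoingCookiesExcept("cookie_consent"),
		).
		Execute()
}
```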

View File

@@ -0,0 +1,13 @@
package requestmodifers
import (
"ladder/proxychain"
"regexp"
)
func ModifyPathWithRegex(match regexp.Regexp, replacement string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.URL.Path = match.ReplaceAllString(px.Request.URL.Path, replacement)
return nil
}
}

View File

@@ -0,0 +1,20 @@
package requestmodifers
import (
"ladder/proxychain"
)
// ModifyQueryParams sets a query parameter value in a ProxyChain's request URL.
// If the key doesn't exist, it is created. If value is "", the key is removed.
func ModifyQueryParams(key string, value string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
q := px.Request.URL.Query()
if value == "" {
q.Del(key)
} else {
q.Set(key, value)
}
// write the modified query back; without this, Del/Set have no effect
px.Request.URL.RawQuery = q.Encode()
return nil
}
}
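
For example, ModifyQueryParams("amp", "1") turns https://example.com/article into https://example.com/article?amp=1, and ModifyQueryParams("amp", "") strips the parameter again.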

View File

@@ -0,0 +1,23 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SetRequestHeader modifies a specific outgoing header
// This is the header that the upstream server will see.
func SetRequestHeader(name string, val string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.Header.Set(name, val)
return nil
}
}
// DeleteRequestHeader modifies a specific outgoing header
// This is the header that the upstream server will see.
func DeleteRequestHeader(name string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.Header.Del(name)
return nil
}
}

View File

@@ -0,0 +1,27 @@
package requestmodifers
import (
"ladder/proxychain"
"net/url"
)
const archivistUrl string = "https://archive.is/latest/"
// RequestArchiveIs modifies a ProxyChain's URL to request an archived version from archive.is
func RequestArchiveIs() proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.URL.RawQuery = ""
newURLString := archivistUrl + px.Request.URL.String()
newURL, err := url.Parse(newURLString)
if err != nil {
return err
}
// archive.is appears to sabotage requests resolved through Cloudflare's DNS,
// so resolve through Google DoH to bypass that just in case
px.AddRequestModifications(ResolveWithGoogleDoH())
px.Request.URL = newURL
return nil
}
}
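
For example, a request for https://example.com/article?id=1 is rewritten to https://archive.is/latest/https://example.com/article (the query string is stripped first).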

View File

@@ -0,0 +1,21 @@
package requestmodifers
import (
"ladder/proxychain"
"net/url"
)
const googleCacheUrl string = "https://webcache.googleusercontent.com/search?q=cache:"
// RequestGoogleCache modifies a ProxyChain's URL to request its Google Cache version.
func RequestGoogleCache() proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
encodedURL := url.QueryEscape(px.Request.URL.String())
newURL, err := url.Parse(googleCacheUrl + encodedURL)
if err != nil {
return err
}
px.Request.URL = newURL
return nil
}
}

View File

@@ -0,0 +1,22 @@
package requestmodifers
import (
"ladder/proxychain"
"net/url"
)
const waybackUrl string = "https://web.archive.org/web/"
// RequestWaybackMachine modifies a ProxyChain's URL to request the wayback machine (archive.org) version.
func RequestWaybackMachine() proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.URL.RawQuery = ""
newURLString := waybackUrl + px.Request.URL.String()
newURL, err := url.Parse(newURLString)
if err != nil {
return err
}
px.Request.URL = newURL
return nil
}
}

View File

@@ -0,0 +1,80 @@
package requestmodifers
import (
"context"
"encoding/json"
"fmt"
"ladder/proxychain"
"net"
"net/http"
"net/url"
"time"
)
// resolveWithGoogleDoH resolves DNS using Google's DNS-over-HTTPS JSON API
func resolveWithGoogleDoH(host string) (string, error) {
dohURL := "https://dns.google/resolve?name=" + url.QueryEscape(host) + "&type=A"
resp, err := http.Get(dohURL)
if err != nil {
return "", err
}
defer resp.Body.Close()
var result struct {
Answer []struct {
Data string `json:"data"`
} `json:"Answer"`
}
err = json.NewDecoder(resp.Body).Decode(&result)
if err != nil {
return "", err
}
// Return the first A record; the Answer section can also contain
// CNAME entries, so skip anything that isn't a valid IP
for _, ans := range result.Answer {
if net.ParseIP(ans.Data) != nil {
return ans.Data, nil
}
}
return "", fmt.Errorf("no DoH DNS record found for %s", host)
}
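// The JSON API response has roughly this shape (a trimmed sketch; the
// Answer section may interleave CNAME records with A records):
// {"Status":0,"Answer":[{"name":"example.com.","type":1,"data":"93.184.216.34"}]}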
// ResolveWithGoogleDoH modifies a ProxyChain's client to make the request by resolving the URL
// using Google's DNS-over-HTTPS service
func ResolveWithGoogleDoH() proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
client := &http.Client{
Timeout: px.Client.Timeout,
}
dialer := &net.Dialer{
Timeout: 5 * time.Second,
KeepAlive: 5 * time.Second,
}
customDialContext := func(ctx context.Context, network, addr string) (net.Conn, error) {
host, port, err := net.SplitHostPort(addr)
if err != nil {
// If the addr doesn't include a port, determine it based on the URL scheme
if px.Request.URL.Scheme == "https" {
port = "443"
} else {
port = "80"
}
host = addr // assume the entire addr is the host
}
resolvedHost, err := resolveWithGoogleDoH(host)
if err != nil {
return nil, err
}
return dialer.DialContext(ctx, network, net.JoinHostPort(resolvedHost, port))
}
patchedTransportWithDoH := &http.Transport{
DialContext: customDialContext,
}
client.Transport = patchedTransportWithDoH
px.Client = client // Assign the modified client to the ProxyChain
return nil
}
}

View File

@@ -0,0 +1,24 @@
package requestmodifers
import (
"fmt"
"ladder/proxychain"
)
// SpoofOrigin sets the Origin header to an arbitrary URL.
// If the upstream server returns a Vary: Origin header,
// changing this value may yield a different response.
func SpoofOrigin(url string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.Header.Set("origin", url)
return nil
}
}
// HideOrigin sets the Origin header to the upstream site's own origin
// rather than the proxy's, so the request looks same-origin to the server
func HideOrigin() proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.Header.Set("origin", fmt.Sprintf("%s://%s", px.Request.URL.Scheme, px.Request.URL.Host))
return nil
}
}

View File

@@ -0,0 +1,29 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrer modifies the Referer header.
// Useful if the page can be accessed from a search engine
// or social media site, but not by browsing the website itself.
// If url is "", the Referer header is removed.
func SpoofReferrer(url string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
// note: the HTTP header is the (historically misspelled) "Referer",
// not "referrer"; setting the wrong name sends a nonstandard header
if url == "" {
px.Request.Header.Del("Referer")
return nil
}
px.Request.Header.Set("Referer", url)
return nil
}
}
// HideReferrer sets the Referer header to the proxied site's own URL
// so that it reflects the original site, not the proxy
func HideReferrer() proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.Header.Set("Referer", px.Request.URL.String())
return nil
}
}

View File

@@ -0,0 +1,44 @@
package requestmodifers
import (
"fmt"
"ladder/proxychain"
"math/rand"
"strings"
"time"
)
// SpoofReferrerFromBaiduSearch modifies the referrer header
// pretending to be from a BaiduSearch
func SpoofReferrerFromBaiduSearch() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
// https://www.baidu.com/link?url=5biIeDvUIihawf3Zbbysach2Xn4H3w3FzO6LZKgSs-B5Yt4M4RUFikokOk5zetf2&wd=&eqid=9da80d8208009b8480000706655d5ed6
referrer := fmt.Sprintf("https://baidu.com/link?url=%s", generateRandomBaiduURL())
chain.AddRequestModifications(
SpoofReferrer(referrer),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
)
return nil
}
}
// utility functions ==================
func generateRandomString(charset string, length int) string {
var seededRand *rand.Rand = rand.New(rand.NewSource(time.Now().UnixNano()))
var stringBuilder strings.Builder
for i := 0; i < length; i++ {
stringBuilder.WriteByte(charset[seededRand.Intn(len(charset))])
}
return stringBuilder.String()
}
func generateRandomBaiduURL() string {
const alphanumericCharset = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
const hexCharset = "0123456789abcdef"
// mimic the shape of the example redirect above: <30 chars>-<22 chars>&wd=&eqid=<16 hex chars>
randomAlphanumeric := generateRandomString(alphanumericCharset, 30) // segment before "-"
randomTail := generateRandomString(alphanumericCharset, 22) // segment after "-"
randomHex := generateRandomString(hexCharset, 16) // length of eqid
return randomAlphanumeric + "-" + randomTail + "&wd=&eqid=" + randomHex
}

View File

@@ -0,0 +1,20 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrerFromBingSearch modifies the referrer header
// pretending to be from a bing search site
func SpoofReferrerFromBingSearch() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddRequestModifications(
SpoofReferrer("https://www.bing.com/"),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
ModifyQueryParams("utm_source", "bing"),
)
return nil
}
}

View File

@@ -0,0 +1,20 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrerFromGoogleSearch modifies the referrer header
// pretending to be from a google search site
func SpoofReferrerFromGoogleSearch() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddRequestModifications(
SpoofReferrer("https://www.google.com/"),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
ModifyQueryParams("utm_source", "google"),
)
return nil
}
}

View File

@@ -0,0 +1,21 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrerFromLinkedInPost modifies the referrer header
// pretending to be from a linkedin post
func SpoofReferrerFromLinkedInPost() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddRequestModifications(
SpoofReferrer("https://www.linkedin.com/"),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
ModifyQueryParams("utm_campaign", "post"),
ModifyQueryParams("utm_medium", "web"),
)
return nil
}
}

View File

@@ -0,0 +1,24 @@
package requestmodifers
import (
"fmt"
"ladder/proxychain"
)
// SpoofReferrerFromNaverSearch modifies the referrer header
// pretending to be from a Naver search (popular in South Korea)
func SpoofReferrerFromNaverSearch() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
referrer := fmt.Sprintf(
"https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=%s",
chain.Request.URL.Host,
)
chain.AddRequestModifications(
SpoofReferrer(referrer),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
)
return nil
}
}

View File

@@ -0,0 +1,19 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrerFromPinterestPost modifies the referrer header
// pretending to be from a pinterest post
func SpoofReferrerFromPinterestPost() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddRequestModifications(
SpoofReferrer("https://www.pinterest.com/"),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
)
return nil
}
}

View File

@@ -0,0 +1,19 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrerFromQQPost modifies the referrer header
// pretending to be from a QQ post (popular social media in China)
func SpoofReferrerFromQQPost() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddRequestModifications(
SpoofReferrer("https://new.qq.com/'"),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
)
return nil
}
}

View File

@@ -0,0 +1,19 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrerFromRedditPost modifies the referrer header
// pretending to be from a reddit post
func SpoofReferrerFromRedditPost() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddRequestModifications(
SpoofReferrer("https://www.reddit.com/"),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
)
return nil
}
}

View File

@@ -0,0 +1,19 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrerFromTumblrPost modifies the referrer header
// pretending to be from a tumblr post
func SpoofReferrerFromTumblrPost() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddRequestModifications(
SpoofReferrer("https://www.tumblr.com/"),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
)
return nil
}
}

View File

@@ -0,0 +1,19 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrerFromTwitterPost modifies the referrer header
// pretending to be from a twitter post
func SpoofReferrerFromTwitterPost() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddRequestModifications(
SpoofReferrer("https://t.co/"),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
)
return nil
}
}

View File

@@ -0,0 +1,19 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofReferrerFromVkontaktePost modifies the referrer header
// pretending to be from a vkontakte post (popular in Russia)
func SpoofReferrerFromVkontaktePost() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddRequestModifications(
SpoofReferrer("https://away.vk.com/"),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
)
return nil
}
}

View File

@@ -0,0 +1,22 @@
package requestmodifers
import (
"fmt"
"ladder/proxychain"
"math/rand"
)
// SpoofReferrerFromWeiboPost modifies the referrer header
// pretending to be from a Weibo post (popular in China)
func SpoofReferrerFromWeiboPost() proxychain.RequestModification {
return func(chain *proxychain.ProxyChain) error {
referrer := fmt.Sprintf("http://weibo.com/u/%d", rand.Intn(90001))
chain.AddRequestModifications(
SpoofReferrer(referrer),
SetRequestHeader("sec-fetch-site", "cross-site"),
SetRequestHeader("sec-fetch-dest", "document"),
SetRequestHeader("sec-fetch-mode", "navigate"),
)
return nil
}
}

View File

@@ -0,0 +1,13 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofUserAgent modifies the user agent
func SpoofUserAgent(ua string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.Header.Set("user-agent", ua)
return nil
}
}

View File

@@ -0,0 +1,14 @@
package requestmodifers
import (
"ladder/proxychain"
)
// SpoofXForwardedFor modifies the X-Forwarded-For header
// in some cases, a forward proxy may interpret this as the source IP
func SpoofXForwardedFor(ip string) proxychain.RequestModification {
return func(px *proxychain.ProxyChain) error {
px.Request.Header.Set("X-FORWARDED-FOR", ip)
return nil
}
}

View File

@@ -0,0 +1,21 @@
package responsemodifers
import (
"ladder/proxychain"
)
// BypassCORS modifies response headers to prevent the browser
// from enforcing any CORS restrictions. This should run at the end of the chain.
func BypassCORS() proxychain.ResponseModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddResponseModifications(
SetResponseHeader("Access-Control-Allow-Origin", "*"),
SetResponseHeader("Access-Control-Expose-Headers", "*"),
SetResponseHeader("Access-Control-Allow-Credentials", "true"),
SetResponseHeader("Access-Control-Allow-Methods", "GET, PUT, POST, DELETE, HEAD, OPTIONS, PATCH"),
SetResponseHeader("Access-Control-Allow-Headers", "*"),
DeleteResponseHeader("X-Frame-Options"),
)
return nil
}
}

View File

@@ -0,0 +1,30 @@
package responsemodifers
import (
"ladder/proxychain"
)
// TODO: handle edge case where CSP is specified in meta tag:
// <meta http-equiv="Content-Security-Policy" content="default-src 'self'">
// BypassContentSecurityPolicy modifies response headers to prevent the browser
// from enforcing any CSP restrictions. This should run at the end of the chain.
func BypassContentSecurityPolicy() proxychain.ResponseModification {
return func(chain *proxychain.ProxyChain) error {
chain.AddResponseModifications(
DeleteResponseHeader("Content-Security-Policy"),
DeleteResponseHeader("Content-Security-Policy-Report-Only"),
DeleteResponseHeader("X-Content-Security-Policy"),
DeleteResponseHeader("X-WebKit-CSP"),
)
return nil
}
}
// SetContentSecurityPolicy modifies response headers to a specific CSP
func SetContentSecurityPolicy(csp string) proxychain.ResponseModification {
return func(chain *proxychain.ProxyChain) error {
chain.Response.Header.Set("Content-Security-Policy", csp)
return nil
}
}

View File

@@ -0,0 +1,27 @@
package responsemodifers
import (
_ "embed"
"ladder/proxychain"
"ladder/proxychain/responsemodifers/rewriters"
"strings"
)
// InjectScript modifies HTTP responses
// to execute javascript at a particular time.
func InjectScript(js string, execTime rewriters.ScriptExecTime) proxychain.ResponseModification {
return func(chain *proxychain.ProxyChain) error {
// don't add rewriter if it's not even html
ct := chain.Response.Header.Get("content-type")
if !strings.HasPrefix(ct, "text/html") {
return nil
}
// the rewriting actually happens in chain.Execute() as the client is streaming the response body back
rr := rewriters.NewScriptInjectorRewriter(js, execTime)
// we just queue it up here
chain.AddHTMLTokenRewriter(rr)
return nil
}
}
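
A usage sketch (the script body is illustrative; chain is a *proxychain.ProxyChain as elsewhere in this package):

chain.AddResponseModifications(
InjectScript(`console.log("hello from the proxy")`, rewriters.AfterDOMIdle),
)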

View File

@@ -0,0 +1,102 @@
package responsemodifers
import (
"fmt"
"ladder/proxychain"
"net/http"
)
// DeleteIncomingCookies prevents ALL cookies from being sent from the proxy server
// back down to the client.
func DeleteIncomingCookies() proxychain.ResponseModification {
return func(px *proxychain.ProxyChain) error {
px.Response.Header.Del("Set-Cookie")
return nil
}
}
// DeleteIncomingCookiesExcept prevents non-whitelisted cookies from being sent from the proxy server
// to the client. Cookies whose names are in the whitelist are not removed.
func DeleteIncomingCookiesExcept(whitelist ...string) proxychain.ResponseModification {
return func(px *proxychain.ProxyChain) error {
// Convert whitelist slice to a map for efficient lookups
whitelistMap := make(map[string]struct{})
for _, cookieName := range whitelist {
whitelistMap[cookieName] = struct{}{}
}
// If the response has no cookies, return early
if px.Response.Header == nil {
return nil
}
// Filter the cookies in the response
filteredCookies := []string{}
for _, cookieStr := range px.Response.Header["Set-Cookie"] {
cookie := parseCookie(cookieStr)
if _, found := whitelistMap[cookie.Name]; found {
filteredCookies = append(filteredCookies, cookieStr)
}
}
// Update the Set-Cookie header with the filtered cookies
if len(filteredCookies) > 0 {
px.Response.Header["Set-Cookie"] = filteredCookies
} else {
px.Response.Header.Del("Set-Cookie")
}
return nil
}
}
// parseCookie parses a Set-Cookie string and returns an http.Cookie object.
func parseCookie(cookieStr string) *http.Cookie {
header := http.Header{}
header.Add("Set-Cookie", cookieStr)
// http.Request.Cookies() only parses the Cookie header;
// Set-Cookie strings must be parsed via http.Response.Cookies()
response := http.Response{Header: header}
cookies := response.Cookies()
if len(cookies) == 0 {
return &http.Cookie{}
}
return cookies[0]
}
// SetIncomingCookies adds a raw cookie string being sent from the proxy server down to the client
func SetIncomingCookies(cookies string) proxychain.ResponseModification {
return func(px *proxychain.ProxyChain) error {
px.Response.Header.Set("Set-Cookie", cookies)
return nil
}
}
// SetIncomingCookie modifies a specific cookie in the response from the proxy server to the client.
func SetIncomingCookie(name string, val string) proxychain.ResponseModification {
return func(px *proxychain.ProxyChain) error {
if px.Response.Header == nil {
return nil
}
updatedCookies := []string{}
found := false
// Iterate over existing cookies and modify the one that matches the cookieName
for _, cookieStr := range px.Response.Header["Set-Cookie"] {
cookie := parseCookie(cookieStr)
if cookie.Name == name {
// Replace the cookie with the new value
updatedCookies = append(updatedCookies, fmt.Sprintf("%s=%s", name, val))
found = true
} else {
// Keep the cookie as is
updatedCookies = append(updatedCookies, cookieStr)
}
}
// If the specified cookie wasn't found, add it
if !found {
updatedCookies = append(updatedCookies, fmt.Sprintf("%s=%s", name, val))
}
// Update the Set-Cookie header
px.Response.Header["Set-Cookie"] = updatedCookies
return nil
}
}

View File

@@ -0,0 +1,21 @@
package responsemodifers
import (
"ladder/proxychain"
)
// SetResponseHeader modifies response headers from the upstream server
func SetResponseHeader(key string, value string) proxychain.ResponseModification {
return func(px *proxychain.ProxyChain) error {
px.Context.Response().Header.Set(key, value)
return nil
}
}
// DeleteResponseHeader removes response headers from the upstream server
func DeleteResponseHeader(key string) proxychain.ResponseModification {
return func(px *proxychain.ProxyChain) error {
px.Context.Response().Header.Del(key)
return nil
}
}

View File

@@ -0,0 +1,55 @@
package responsemodifers
import (
_ "embed"
"fmt"
"ladder/proxychain"
"ladder/proxychain/responsemodifers/rewriters"
"strings"
)
//go:embed patch_dynamic_resource_urls.js
var patchDynamicResourceURLsScript string
// PatchDynamicResourceURLs patches the javascript runtime to rewrite URLs client-side.
// - This function is designed to allow the proxified page
// to still be browsable by routing all resource URLs through the proxy.
// - Native APIs capable of network requests will be hooked
// and the URLs arguments modified to point to the proxy instead.
// - fetch('/relative_path') -> fetch('/https://proxiedsite.com/relative_path')
// - Element.setAttribute('src', "/assets/img.jpg") -> Element.setAttribute('src', "/https://proxiedsite.com/assets/img.jpg")
func PatchDynamicResourceURLs() proxychain.ResponseModification {
return func(chain *proxychain.ProxyChain) error {
// don't add rewriter if it's not even html
ct := chain.Response.Header.Get("content-type")
if !strings.HasPrefix(ct, "text/html") {
return nil
}
// this is the original URL sent by client:
// http://localhost:8080/http://proxiedsite.com/foo/bar
originalURI := chain.Context.Request().URI()
// this is the extracted URL that the client requests to proxy
// http://proxiedsite.com/foo/bar
reqURL := chain.Request.URL
params := map[string]string{
// ie: http://localhost:8080
"{{PROXY_ORIGIN}}": fmt.Sprintf("%s://%s", originalURI.Scheme(), originalURI.Host()),
// ie: http://proxiedsite.com
"{{ORIGIN}}": fmt.Sprintf("%s://%s", reqURL.Scheme, reqURL.Host),
}
// the rewriting actually happens in chain.Execute() as the client is streaming the response body back
rr := rewriters.NewScriptInjectorRewriterWithParams(
patchDynamicResourceURLsScript,
rewriters.BeforeDOMContentLoaded,
params,
)
// we just queue it up here
chain.AddHTMLTokenRewriter(rr)
return nil
}
}
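
For example, if the client requested http://localhost:8080/https://proxiedsite.com/foo, the placeholders resolve to {{PROXY_ORIGIN}} = "http://localhost:8080" and {{ORIGIN}} = "https://proxiedsite.com" before the script is injected.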

View File

@@ -0,0 +1,325 @@
// Overrides the global fetch and XMLHttpRequest open methods to modify the request URLs.
// Also overrides the attribute setter prototype to modify the request URLs
// fetch("/relative_script.js") -> fetch("http://localhost:8080/relative_script.js")
(() => {
// ============== PARAMS ===========================
// if the original request was: http://localhost:8080/http://proxiedsite.com/foo/bar
// proxyOrigin is http://localhost:8080
const proxyOrigin = "{{PROXY_ORIGIN}}";
//const proxyOrigin = globalThis.window.location.origin;
// if the original request was: http://localhost:8080/http://proxiedsite.com/foo/bar
// origin is http://proxiedsite.com
const origin = "{{ORIGIN}}";
//const origin = (new URL(decodeURIComponent(globalThis.window.location.pathname.substring(1)))).origin
// ============== END PARAMS ======================
const blacklistedSchemes = [
"ftp:",
"mailto:",
"tel:",
"file:",
"blob:",
"javascript:",
"about:",
"magnet:",
"ws:",
"wss:",
];
function rewriteURL(url) {
const oldUrl = url
if (!url) return url
let isStr = (typeof url.startsWith === 'function')
if (!isStr) return url
// don't rewrite special URIs
if (blacklistedSchemes.some((scheme) => url.startsWith(scheme))) return url;
// don't rewrite invalid URIs
try { new URL(url, origin) } catch { return url }
// don't double rewrite
if (url.startsWith(proxyOrigin)) return url;
if (url.startsWith(`/${proxyOrigin}`)) return url;
if (url.startsWith(`/${origin}`)) return url;
if (url.startsWith(`/http://`)) return url;
if (url.startsWith(`/https://`)) return url;
if (url.startsWith(`/http%3A%2F%2F`)) return url;
if (url.startsWith(`/https%3A%2F%2F`)) return url;
if (url.startsWith(`/%2Fhttp`)) return url;
//console.log(`proxychain: origin: ${origin} // proxyOrigin: ${proxyOrigin} // original: ${oldUrl}`)
if (url.startsWith("//")) {
url = `/${origin}/${encodeURIComponent(url.substring(2))}`;
} else if (url.startsWith("/")) {
url = `/${origin}/${encodeURIComponent(url.substring(1))}`;
} else if (url.startsWith(origin)) {
url = `/${encodeURIComponent(url)}`
} else if (url.startsWith("http://") || url.startsWith("https://")) {
url = `/${proxyOrigin}/${encodeURIComponent(url)}`;
}
console.log(`proxychain: rewrite JS URL: ${oldUrl} -> ${url}`)
return url;
};
// sometimes anti-bot protections like cloudflare or akamai bot manager check if JS is hooked
function hideMonkeyPatch(objectOrName, method, originalToString) {
let obj;
let isGlobalFunction = false;
if (typeof objectOrName === 'string') {
obj = globalThis[objectOrName];
isGlobalFunction = (typeof obj === 'function') && (method === objectOrName);
} else {
obj = objectOrName;
}
if (isGlobalFunction) {
const originalFunction = obj;
globalThis[objectOrName] = function(...args) {
return originalFunction.apply(this, args);
};
globalThis[objectOrName].toString = () => originalToString;
} else if (obj && typeof obj[method] === 'function') {
const originalMethod = obj[method];
obj[method] = function(...args) {
return originalMethod.apply(this, args);
};
obj[method].toString = () => originalToString;
} else {
console.warn(`proxychain: cannot hide monkey patch: ${method} is not a function on the provided object.`);
}
}
// monkey patch fetch
const oldFetch = fetch;
fetch = async (url, init) => {
return oldFetch(rewriteURL(url), init)
}
hideMonkeyPatch('fetch', 'fetch', 'function fetch() { [native code] }')
// monkey patch xmlhttprequest
const oldOpen = XMLHttpRequest.prototype.open;
XMLHttpRequest.prototype.open = function(method, url, async = true, user = null, password = null) {
return oldOpen.call(this, method, rewriteURL(url), async, user, password);
};
hideMonkeyPatch(XMLHttpRequest.prototype, 'open', 'function(){if("function"==typeof eo)return eo.apply(this,arguments)}');
// note: XMLHttpRequest.send(body) takes a request body, not a URL;
// the URL is already rewritten in open() above, so send is left unpatched.
// monkey patch service worker registration
const oldRegister = ServiceWorkerContainer.prototype.register;
ServiceWorkerContainer.prototype.register = function(scriptURL, options) {
return oldRegister.call(this, rewriteURL(scriptURL), options)
}
hideMonkeyPatch(ServiceWorkerContainer.prototype, 'register', 'function register() { [native code] }')
// monkey patch URL.toString() method
const oldToString = URL.prototype.toString
URL.prototype.toString = function() {
let originalURL = oldToString.call(this)
return rewriteURL(originalURL)
}
hideMonkeyPatch(URL.prototype, 'toString', 'function toString() { [native code] }')
// monkey patch URL.toJSON() method
const oldToJson = URL.prototype.toJSON
URL.prototype.toJSON = function() {
let originalURL = oldToJson.call(this)
return rewriteURL(originalURL)
}
hideMonkeyPatch(URL.prototype, 'toJSON', 'function toJSON() { [native code] }')
// Monkey patch URL.href getter and setter
const originalHrefDescriptor = Object.getOwnPropertyDescriptor(URL.prototype, 'href');
Object.defineProperty(URL.prototype, 'href', {
get: function() {
let originalHref = originalHrefDescriptor.get.call(this);
return rewriteURL(originalHref)
},
set: function(newValue) {
originalHrefDescriptor.set.call(this, rewriteURL(newValue));
}
});
// TODO: do one more pass of this by manually traversing the DOM
// AFTER all the JS and page has loaded just in case
// Monkey patch setter
const elements = [
{ tag: 'a', attribute: 'href' },
{ tag: 'img', attribute: 'src' },
// { tag: 'img', attribute: 'srcset' }, // TODO: handle srcset
{ tag: 'script', attribute: 'src' },
{ tag: 'link', attribute: 'href' },
{ tag: 'link', attribute: 'icon' },
{ tag: 'iframe', attribute: 'src' },
{ tag: 'audio', attribute: 'src' },
{ tag: 'video', attribute: 'src' },
{ tag: 'source', attribute: 'src' },
// { tag: 'source', attribute: 'srcset' }, // TODO: handle srcset
{ tag: 'embed', attribute: 'src' },
{ tag: 'embed', attribute: 'pluginspage' },
{ tag: 'html', attribute: 'manifest' },
{ tag: 'object', attribute: 'src' },
{ tag: 'input', attribute: 'src' },
{ tag: 'track', attribute: 'src' },
{ tag: 'form', attribute: 'action' },
{ tag: 'area', attribute: 'href' },
{ tag: 'base', attribute: 'href' },
{ tag: 'blockquote', attribute: 'cite' },
{ tag: 'del', attribute: 'cite' },
{ tag: 'ins', attribute: 'cite' },
{ tag: 'q', attribute: 'cite' },
{ tag: 'button', attribute: 'formaction' },
{ tag: 'input', attribute: 'formaction' },
{ tag: 'meta', attribute: 'content' },
{ tag: 'object', attribute: 'data' },
];
elements.forEach(({ tag, attribute }) => {
const proto = document.createElement(tag).constructor.prototype;
const descriptor = Object.getOwnPropertyDescriptor(proto, attribute);
if (descriptor && descriptor.set) {
Object.defineProperty(proto, attribute, {
...descriptor,
set(value) {
// calling rewriteURL will end up calling a setter for href,
// leading to a recursive loop and a "Maximum call stack size exceeded"
// error, so we guard against this with a local semaphore flag
const isRewritingSetKey = Symbol.for('isRewritingSet');
if (!this[isRewritingSetKey]) {
this[isRewritingSetKey] = true;
descriptor.set.call(this, rewriteURL(value));
//descriptor.set.call(this, value);
this[isRewritingSetKey] = false;
} else {
// Directly set the value without rewriting
descriptor.set.call(this, value);
}
},
get() {
const isRewritingGetKey = Symbol.for('isRewritingGet');
if (!this[isRewritingGetKey]) {
this[isRewritingGetKey] = true;
let oldURL = descriptor.get.call(this);
let newURL = rewriteURL(oldURL);
this[isRewritingGetKey] = false;
return newURL
} else {
return descriptor.get.call(this);
}
}
});
}
});
// sometimes, libraries will set the Element.innerHTML or Element.outerHTML directly with a string instead of setters.
// in this case, we intercept it, create a fake DOM, parse it and then rewrite all attributes that could
// contain a URL. Then we return the replacement innerHTML/outerHTML with redirected links.
function rewriteInnerHTML(html, elements) {
const isRewritingHTMLKey = Symbol.for('isRewritingHTML');
// Check if already processing
if (document[isRewritingHTMLKey]) {
return html;
}
const tempContainer = document.createElement('div');
document[isRewritingHTMLKey] = true;
try {
tempContainer.innerHTML = html;
// Create a map for quick lookup
const elementsMap = new Map(elements.map(e => [e.tag, e.attribute]));
// Loop-based DOM traversal
const nodes = [...tempContainer.querySelectorAll('*')];
for (const node of nodes) {
const attribute = elementsMap.get(node.tagName.toLowerCase());
if (attribute && node.hasAttribute(attribute)) {
const originalUrl = node.getAttribute(attribute);
const rewrittenUrl = rewriteURL(originalUrl);
node.setAttribute(attribute, rewrittenUrl);
}
}
return tempContainer.innerHTML;
} finally {
// Clear the flag
document[isRewritingHTMLKey] = false;
}
}
// Store original setters
const originalSetters = {};
['innerHTML', 'outerHTML'].forEach(property => {
const descriptor = Object.getOwnPropertyDescriptor(Element.prototype, property);
if (descriptor && descriptor.set) {
originalSetters[property] = descriptor.set;
Object.defineProperty(Element.prototype, property, {
...descriptor,
set(value) {
const isRewritingHTMLKey = Symbol.for('isRewritingHTML');
if (!this[isRewritingHTMLKey]) {
this[isRewritingHTMLKey] = true;
try {
// Use custom logic
descriptor.set.call(this, rewriteInnerHTML(value, elements));
} finally {
this[isRewritingHTMLKey] = false;
}
} else {
// Use original setter in recursive call
originalSetters[property].call(this, value);
}
}
});
}
});
})();
(() => {
document.addEventListener('DOMContentLoaded', (event) => {
initIdleMutationObserver();
});
function initIdleMutationObserver() {
let debounceTimer;
const debounceDelay = 500; // adjust the delay as needed
const observer = new MutationObserver((mutations) => {
// Clear the previous timer and set a new one
clearTimeout(debounceTimer);
debounceTimer = setTimeout(() => {
execute();
observer.disconnect(); // Disconnect after first execution
}, debounceDelay);
});
const config = { attributes: false, childList: true, subtree: true };
observer.observe(document.body, config);
}
function execute() {
console.log('DOM is now idle. Executing...');
}
})();

View File

@@ -0,0 +1,35 @@
package responsemodifers
import (
_ "embed"
"fmt"
"ladder/proxychain"
"ladder/proxychain/responsemodifers/rewriters"
"strings"
)
// RewriteHTMLResourceURLs modifies HTTP responses
// to rewrite URL attributes in HTML content (such as src, href)
// - `<img src='/relative_path'>` -> `<img src='/https://proxiedsite.com/relative_path'>`
// - This function is designed to allow the proxified page
// to still be browsable by routing all resource URLs through the proxy.
func RewriteHTMLResourceURLs() proxychain.ResponseModification {
return func(chain *proxychain.ProxyChain) error {
// don't add rewriter if it's not even html
ct := chain.Response.Header.Get("content-type")
if !strings.HasPrefix(ct, "text/html") {
return nil
}
// proxyURL is the URL of the ladder: http://localhost:8080 (ladder)
originalURI := chain.Context.Request().URI()
proxyURL := fmt.Sprintf("%s://%s", originalURI.Scheme(), originalURI.Host())
// the rewriting actually happens in chain.Execute() as the client is streaming the response body back
rr := rewriters.NewHTMLTokenURLRewriter(chain.Request.URL, proxyURL)
// we just queue it up here
chain.AddHTMLTokenRewriter(rr)
return nil
}
}

View File

@@ -0,0 +1,27 @@
(() => {
document.addEventListener('DOMContentLoaded', (event) => {
initIdleMutationObserver();
});
function initIdleMutationObserver() {
let debounceTimer;
const debounceDelay = 500; // adjust the delay as needed
const observer = new MutationObserver((mutations) => {
// Clear the previous timer and set a new one
clearTimeout(debounceTimer);
debounceTimer = setTimeout(() => {
execute();
observer.disconnect(); // Disconnect after first execution
}, debounceDelay);
});
const config = { attributes: false, childList: true, subtree: true };
observer.observe(document.body, config);
}
function execute() {
'SCRIPT_CONTENT_PARAM'
//console.log('DOM is now idle. Executing...');
}
})();

View File

@@ -0,0 +1,3 @@
package rewriters
// todo: implement

View File

@@ -0,0 +1,131 @@
package rewriters
import (
"bytes"
"io"
"golang.org/x/net/html"
)
// IHTMLTokenRewriter defines an interface for modifying HTML tokens.
type IHTMLTokenRewriter interface {
// ShouldModify determines whether a given HTML token requires modification.
ShouldModify(*html.Token) bool
// ModifyToken applies modifications to a given HTML token.
// It returns strings representing content to be prepended and
// appended to the token. If no modifications are required or if an error occurs,
// it returns empty strings for both 'prepend' and 'append'.
// Note: The original token is not modified if an error occurs.
ModifyToken(*html.Token) (prepend, append string)
}
// HTMLRewriter takes multiple IHTMLTokenRewriters and processes all
// HTML tokens from http.Response.Body in a single pass, making changes and returning a new io.ReadCloser
//
// - HTMLRewriter reads the http.Response.Body stream,
// parsing each HTML token one at a time and making modifications (defined by implementations of IHTMLTokenRewriter)
// in a single pass of the tokenizer.
//
// - When ProxyChain.Execute() is called, the response body will be read from the server
// and pulled through each ResponseModification which wraps the ProxyChain.Response.Body
// without ever buffering the entire HTTP response in memory.
type HTMLRewriter struct {
tokenizer *html.Tokenizer
currentToken *html.Token
tokenBuffer *bytes.Buffer
currentTokenProcessed bool
rewriters []IHTMLTokenRewriter
}
// NewHTMLRewriter creates a new HTMLRewriter instance.
// It processes HTML tokens from an io.ReadCloser source (typically http.Response.Body)
// using a series of HTMLTokenRewriters. Each HTMLTokenRewriter in the 'rewriters' slice
// applies its specific modifications to the HTML tokens.
// The HTMLRewriter reads from the provided 'src', applies the modifications,
// and returns the processed content as a new io.ReadCloser.
// This new io.ReadCloser can be used to stream the modified content back to the client.
//
// Parameters:
// - src: An io.ReadCloser representing the source of the HTML content, such as http.Response.Body.
// - rewriters: A slice of HTMLTokenRewriters that define the modifications to be applied to the HTML tokens.
//
// Returns:
// - A pointer to an HTMLRewriter, which implements io.ReadCloser, containing the modified HTML content.
func NewHTMLRewriter(src io.ReadCloser, rewriters []IHTMLTokenRewriter) *HTMLRewriter {
return &HTMLRewriter{
tokenizer: html.NewTokenizer(src),
currentToken: nil,
tokenBuffer: new(bytes.Buffer),
currentTokenProcessed: false,
rewriters: rewriters,
}
}
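// Usage sketch (hypothetical wiring; ProxyChain.Execute does the equivalent
// internally): wrap an upstream response body so HTML tokens are rewritten
// as the client streams the response.
//
// rewriters := []IHTMLTokenRewriter{
// NewHTMLTokenURLRewriter(baseURL, "http://localhost:8080"),
// }
// resp.Body = NewHTMLRewriter(resp.Body, rewriters)
// // io.Copy(w, resp.Body) now streams the modified HTML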
// Close resets the internal state of HTMLRewriter, clearing buffers and token data.
func (r *HTMLRewriter) Close() error {
r.tokenBuffer.Reset()
r.currentToken = nil
r.currentTokenProcessed = false
return nil
}
// Read processes the HTML content, applying each rewriter and managing the state of tokens.
func (r *HTMLRewriter) Read(p []byte) (int, error) {
if r.currentToken == nil || r.currentToken.Data == "" || r.currentTokenProcessed {
tokenType := r.tokenizer.Next()
// done reading html, close out reader
if tokenType == html.ErrorToken {
if r.tokenizer.Err() == io.EOF {
return 0, io.EOF
}
return 0, r.tokenizer.Err()
}
// get the next token; reset buffer
t := r.tokenizer.Token()
r.currentToken = &t
r.tokenBuffer.Reset()
// buffer += "<prepends> <token> <appends>"
// process token through all registered rewriters
// rewriters will modify the token, and optionally
// return a <prepend> or <append> string token
appends := make([]string, 0, len(r.rewriters))
for _, rewriter := range r.rewriters {
if !rewriter.ShouldModify(r.currentToken) {
continue
}
prepend, a := rewriter.ModifyToken(r.currentToken)
appends = append(appends, a)
// add <prepends> to buffer
r.tokenBuffer.WriteString(prepend)
}
// add <token> to buffer
if tokenType == html.TextToken {
// don't unescape textTokens (such as inline scripts).
// Token.String() by default will escape the inputs, but
// we don't want to modify the original source
r.tokenBuffer.WriteString(r.currentToken.Data)
} else {
r.tokenBuffer.WriteString(r.currentToken.String())
}
// add <appends> to buffer
for _, a := range appends {
r.tokenBuffer.WriteString(a)
}
r.currentTokenProcessed = false
}
n, err := r.tokenBuffer.Read(p)
if err == io.EOF || r.tokenBuffer.Len() == 0 {
r.currentTokenProcessed = true
err = nil // EOF in this context is expected and not an actual error
}
return n, err
}

View File

@@ -0,0 +1,263 @@
package rewriters
import (
_ "embed"
"fmt"
"log"
"net/url"
"regexp"
"strings"
"golang.org/x/net/html"
)
var rewriteAttrs map[string]map[string]bool
var specialRewriteAttrs map[string]map[string]bool
var schemeBlacklist map[string]bool

// precompiled once so it isn't recompiled for every token
var metaRefreshRegex = regexp.MustCompile(`^\d+;url=`)
func init() {
// define all tag/attributes which might contain URLs
// to attempt to rewrite to point to proxy instead
rewriteAttrs = map[string]map[string]bool{
"img": {"src": true, "srcset": true, "longdesc": true, "usemap": true},
"a": {"href": true},
"form": {"action": true},
"link": {"href": true, "manifest": true, "icon": true},
"script": {"src": true},
"video": {"src": true, "poster": true},
"audio": {"src": true},
"iframe": {"src": true, "longdesc": true},
"embed": {"src": true},
"object": {"data": true, "codebase": true},
"source": {"src": true, "srcset": true},
"track": {"src": true},
"area": {"href": true},
"base": {"href": true},
"blockquote": {"cite": true},
"del": {"cite": true},
"ins": {"cite": true},
"q": {"cite": true},
"body": {"background": true},
"button": {"formaction": true},
"input": {"src": true, "formaction": true},
"meta": {"content": true},
}
// might contain URL but requires special handling
specialRewriteAttrs = map[string]map[string]bool{
"img": {"srcset": true},
"source": {"srcset": true},
"meta": {"content": true},
}
// define URIs NOT to rewrite
// for example: don't rewrite <img src="data:image/png;base64,iVBORw...">
schemeBlacklist = map[string]bool{
"data": true,
"tel": true,
"mailto": true,
"file": true,
"blob": true,
"javascript": true,
"about": true,
"magnet": true,
"ws": true,
"wss": true,
"ftp": true,
}
}
// HTMLTokenURLRewriter implements IHTMLTokenRewriter.
// It rewrites URLs within HTML resources to use a specified proxy URL.
// <img src='/relative_path'> -> <img src='/https://proxiedsite.com/relative_path'>
type HTMLTokenURLRewriter struct {
baseURL *url.URL
proxyURL string // ladder URL, not proxied site URL
}
// NewHTMLTokenURLRewriter creates a new instance of HTMLTokenURLRewriter.
// It initializes the tokenizer with the provided source and sets the proxy URL.
func NewHTMLTokenURLRewriter(baseURL *url.URL, proxyURL string) *HTMLTokenURLRewriter {
return &HTMLTokenURLRewriter{
baseURL: baseURL,
proxyURL: proxyURL,
}
}
func (r *HTMLTokenURLRewriter) ShouldModify(token *html.Token) bool {
attrLen := len(token.Attr)
if attrLen == 0 {
return false
}
if !(token.Type == html.StartTagToken || token.Type == html.SelfClosingTagToken) {
return false
}
return true
}
func (r *HTMLTokenURLRewriter) ModifyToken(token *html.Token) (string, string) {
for i := range token.Attr {
attr := &token.Attr[i]
switch {
// don't touch tag/attributes that don't contain URIs
case !rewriteAttrs[token.Data][attr.Key]:
continue
// don't touch attributes with special URIs (like data:)
case schemeBlacklist[strings.Split(attr.Val, ":")[0]]:
continue
// don't double-overwrite the url
case strings.HasPrefix(attr.Val, r.proxyURL):
continue
case strings.HasPrefix(attr.Val, "/http://"):
continue
case strings.HasPrefix(attr.Val, "/https://"):
continue
// handle special rewrites
case specialRewriteAttrs[token.Data][attr.Key]:
r.handleSpecialAttr(token, attr, r.baseURL)
continue
default:
// rewrite url
handleURLPart(attr, r.baseURL)
}
}
return "", ""
}
// dispatcher for ModifyToken based on URI type
func handleURLPart(attr *html.Attribute, baseURL *url.URL) {
switch {
case strings.HasPrefix(attr.Val, "//"):
handleProtocolRelativePath(attr, baseURL)
case strings.HasPrefix(attr.Val, "/"):
handleRootRelativePath(attr, baseURL)
case strings.HasPrefix(attr.Val, "http://"), strings.HasPrefix(attr.Val, "https://"):
handleAbsolutePath(attr, baseURL)
default:
handleDocumentRelativePath(attr, baseURL)
}
}
// Protocol-relative URLs: These start with "//" and use the same protocol (http or https) as the current page.
func handleProtocolRelativePath(attr *html.Attribute, baseURL *url.URL) {
// "//cdn.example.com/a.png" already names a host, so prepend the
// base URL's scheme and rewrite it as an absolute URL
attr.Val = fmt.Sprintf("%s:%s", baseURL.Scheme, attr.Val)
handleAbsolutePath(attr, baseURL)
}
// Root-relative URLs: These are relative to the root path and start with a "/".
func handleRootRelativePath(attr *html.Attribute, baseURL *url.URL) {
// doublecheck this is a valid relative URL
_, err := url.Parse(fmt.Sprintf("http://localhost.com%s", attr.Val))
if err != nil {
log.Println(err)
return
}
// build "scheme://host/path" without a leading slash; the single
// proxy-path slash is added after escaping
attr.Val = fmt.Sprintf(
"%s://%s/%s",
baseURL.Scheme,
baseURL.Host,
strings.TrimPrefix(attr.Val, "/"),
)
attr.Val = escape(attr.Val)
attr.Val = fmt.Sprintf("/%s", attr.Val)
log.Printf("root rel url rewritten-> '%s'='%s'", attr.Key, attr.Val)
}
// Document-relative URLs: These are relative to the current document's path and don't start with a "/".
func handleDocumentRelativePath(attr *html.Attribute, baseURL *url.URL) {
// resolve the relative reference against the base document URL
// so that paths like "img/a.png" and "../a.png" are handled correctly
ref, err := url.Parse(attr.Val)
if err != nil {
log.Println(err)
return
}
resolved := baseURL.ResolveReference(ref)
attr.Val = fmt.Sprintf("/%s", escape(resolved.String()))
log.Printf("doc rel url rewritten-> '%s'='%s'", attr.Key, attr.Val)
}
// full URIs beginning with https?://proxiedsite.com
func handleAbsolutePath(attr *html.Attribute, baseURL *url.URL) {
// check if valid URL
u, err := url.Parse(attr.Val)
if err != nil {
return
}
if !(u.Scheme == "http" || u.Scheme == "https") {
return
}
attr.Val = fmt.Sprintf("/%s", escape(strings.TrimPrefix(attr.Val, "/")))
log.Printf("abs url rewritten-> '%s'='%s'", attr.Key, attr.Val)
}
// handle edge cases for special attributes
func (r *HTMLTokenURLRewriter) handleSpecialAttr(token *html.Token, attr *html.Attribute, baseURL *url.URL) {
switch {
// srcset attribute doesn't contain a single URL but a comma-separated list of URLs, each potentially followed by a space and a descriptor (like a width, pixel density, or other conditions).
case token.Data == "img" && attr.Key == "srcset":
handleSrcSet(attr, baseURL)
case token.Data == "source" && attr.Key == "srcset":
handleSrcSet(attr, baseURL)
// meta with http-equiv="refresh": The content attribute of a meta tag, when used for a refresh directive, contains a time interval followed by a URL, like content="5;url=http://example.com/".
case token.Data == "meta" && attr.Key == "content" && regexp.MustCompile(`^\d+;url=`).MatchString(attr.Val):
handleMetaRefresh(attr, baseURL)
default:
break
}
}
func handleMetaRefresh(attr *html.Attribute, baseURL *url.URL) {
parts := strings.SplitN(attr.Val, ";url=", 2)
if len(parts) != 2 {
return
}
// rewrite only the URL portion by passing in a fake attribute
f := &html.Attribute{Val: parts[1], Key: "src"}
handleURLPart(f, baseURL)
attr.Val = fmt.Sprintf("%s;url=%s", parts[0], f.Val)
}
func handleSrcSet(attr *html.Attribute, baseURL *url.URL) {
var srcSetBuilder strings.Builder
srcSetItems := strings.Split(attr.Val, ",")
for i, srcItem := range srcSetItems {
srcParts := strings.Fields(srcItem) // Fields splits around whitespace, trimming them
if len(srcParts) == 0 {
continue // skip empty items
}
// rewrite each URL part by passing in a fake attribute
f := &html.Attribute{Val: srcParts[0], Key: "src"}
handleURLPart(f, baseURL)
srcSetBuilder.WriteString(f.Val)
// keep the width/density descriptor (e.g. "640w", "2x") untouched
if len(srcParts) > 1 {
srcSetBuilder.WriteString(" " + srcParts[1])
}
if i < len(srcSetItems)-1 {
srcSetBuilder.WriteString(",") // Add comma for all but last item
}
}
attr.Val = srcSetBuilder.String()
log.Printf("srcset url rewritten-> '%s'='%s'", attr.Key, attr.Val)
}
func escape(str string) string {
return strings.ReplaceAll(url.PathEscape(str), "%2F", "/")
}
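
With the handlers above, and assuming baseURL = https://proxiedsite.com/news/ and proxyURL = http://localhost:8080, the rewrites look like:

//cdn.site.com/a.png -> /https://cdn.site.com/a.png (protocol-relative)
/assets/a.png -> /https://proxiedsite.com/assets/a.png (root-relative)
img/a.png -> /https://proxiedsite.com/news/img/a.png (document-relative)
https://other.com/x -> /https://other.com/x (absolute)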

View File

@@ -0,0 +1,91 @@
package rewriters
import (
_ "embed"
"fmt"
"sort"
"strings"
"golang.org/x/net/html"
"golang.org/x/net/html/atom"
)
// ScriptInjectorRewriter implements IHTMLTokenRewriter.
// It injects JS into the page by processing HTML tokens
// and inserting a <script> element at a specified execution time.
type ScriptInjectorRewriter struct {
execTime ScriptExecTime
script string
}
type ScriptExecTime int
const (
BeforeDOMContentLoaded ScriptExecTime = iota
AfterDOMContentLoaded
AfterDOMIdle
)
func (r *ScriptInjectorRewriter) ShouldModify(token *html.Token) bool {
// modify if token == <head>
return token.DataAtom == atom.Head && token.Type == html.StartTagToken
}
//go:embed after_dom_idle_script_injector.js
var afterDomIdleScriptInjector string
func (r *ScriptInjectorRewriter) ModifyToken(token *html.Token) (string, string) {
switch r.execTime {
case BeforeDOMContentLoaded:
return "", fmt.Sprintf("\n<script>\n%s\n</script>\n", r.script)
case AfterDOMContentLoaded:
return "", fmt.Sprintf("\n<script>\ndocument.addEventListener('DOMContentLoaded', () => { %s });\n</script>", r.script)
case AfterDOMIdle:
s := strings.Replace(afterDomIdleScriptInjector, `'SCRIPT_CONTENT_PARAM'`, r.script, 1)
return "", fmt.Sprintf("\n<script>\n%s\n</script>\n", s)
default:
return "", ""
}
}
// applies parameters by string replacement of the template script
func (r *ScriptInjectorRewriter) applyParams(params map[string]string) {
// Sort the keys by length in descending order
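// so that longer keys are substituted first; e.g. {{PROXY_ORIGIN}} must be
// replaced before {{ORIGIN}}, which is a substring of it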
keys := make([]string, 0, len(params))
for key := range params {
keys = append(keys, key)
}
sort.Slice(keys, func(i, j int) bool {
return len(keys[i]) > len(keys[j])
})
for _, key := range keys {
r.script = strings.ReplaceAll(r.script, key, params[key])
}
}
// NewScriptInjectorRewriter returns a ScriptInjectorRewriter (an IHTMLTokenRewriter)
// that injects JS into the page for execution at a particular time
func NewScriptInjectorRewriter(script string, execTime ScriptExecTime) *ScriptInjectorRewriter {
return &ScriptInjectorRewriter{
execTime: execTime,
script: script,
}
}
// NewScriptInjectorRewriterWithParams returns a ScriptInjectorRewriter (an IHTMLTokenRewriter)
// that injects JS into the page for execution at a particular time,
// accepting arguments into the script, which are applied via string replacement.
// The params map holds the key-value pairs of the params;
// each key is string-replaced with its value.
func NewScriptInjectorRewriterWithParams(script string, execTime ScriptExecTime, params map[string]string) *ScriptInjectorRewriter {
rr := &ScriptInjectorRewriter{
execTime: execTime,
script: script,
}
rr.applyParams(params)
return rr
}

View File

@@ -21,174 +21,3 @@
- position: h1
  replace: |
    <h1>An example with a ladder ;-)</h1>
- domain: www.americanbanker.com
paths:
- /news
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const inlineGate = document.querySelector('.inline-gate');
if (inlineGate) {
inlineGate.classList.remove('inline-gate');
const inlineGated = document.querySelectorAll('.inline-gated');
for (const elem of inlineGated) { elem.classList.remove('inline-gated'); }
}
});
</script>
- domain: www.nzz.ch
paths:
- /international
- /sport
- /wirtschaft
- /technologie
- /feuilleton
- /zuerich
- /wissenschaft
- /gesellschaft
- /panorama
- /mobilitaet
- /reisen
- /meinung
- /finanze
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const paywall = document.querySelector('.dynamic-regwall');
removeDOMElement(paywall)
});
</script>
- domains:
- www.architecturaldigest.com
- www.bonappetit.com
- www.cntraveler.com
- www.epicurious.com
- www.gq.com
- www.newyorker.com
- www.vanityfair.com
- www.vogue.com
- www.wired.com
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const banners = document.querySelectorAll('.paywall-bar, div[class^="MessageBannerWrapper-"]');
banners.forEach(el => { el.remove(); });
});
</script>
- domains:
- www.nytimes.com
- www.time.com
headers:
user-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
cookie: nyt-a=; nyt-gdpr=0; nyt-geo=DE; nyt-privacy=1
referer: https://www.google.com/
injections:
- position: head
append: |
<script>
window.localStorage.clear();
document.addEventListener("DOMContentLoaded", () => {
const banners = document.querySelectorAll('div[data-testid="inline-message"], div[id^="ad-"], div[id^="leaderboard-"], div.expanded-dock, div.pz-ad-box, div[id="top-wrapper"], div[id="bottom-wrapper"]');
banners.forEach(el => { el.remove(); });
});
</script>
- domains:
- www.thestar.com
- www.niagarafallsreview.ca
- www.stcatharinesstandard.ca
- www.thepeterboroughexaminer.com
- www.therecord.com
- www.thespec.com
- www.wellandtribune.ca
injections:
- position: head
append: |
<script>
window.localStorage.clear();
document.addEventListener("DOMContentLoaded", () => {
const paywall = document.querySelectorAll('div.subscriber-offers');
paywall.forEach(el => { el.remove(); });
const subscriber_only = document.querySelectorAll('div.subscriber-only');
for (const elem of subscriber_only) {
if (elem.classList.contains('encrypted-content') && dompurify_loaded) {
const parser = new DOMParser();
const doc = parser.parseFromString('<div>' + DOMPurify.sanitize(unscramble(elem.innerText)) + '</div>', 'text/html');
const content_new = doc.querySelector('div');
elem.parentNode.replaceChild(content_new, elem);
}
elem.removeAttribute('style');
elem.removeAttribute('class');
}
const banners = document.querySelectorAll('div.subscription-required, div.redacted-overlay, div.subscriber-hide, div.tnt-ads-container');
banners.forEach(el => { el.remove(); });
const ads = document.querySelectorAll('div.tnt-ads-container, div[class*="adLabelWrapper"]');
ads.forEach(el => { el.remove(); });
const recommendations = document.querySelectorAll('div[id^="tncms-region-article"]');
recommendations.forEach(el => { el.remove(); });
});
</script>
- domain: www.usatoday.com
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const banners = document.querySelectorAll('div.roadblock-container, .gnt_nb, [aria-label="advertisement"], div[id="main-frame-error"]');
banners.forEach(el => { el.remove(); });
});
</script>
- domain: www.washingtonpost.com
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
let paywall = document.querySelectorAll('div[data-qa$="-ad"], div[id="leaderboard-wrapper"], div[data-qa="subscribe-promo"]');
paywall.forEach(el => { el.remove(); });
const images = document.querySelectorAll('img');
images.forEach(image => { image.parentElement.style.filter = ''; });
const headimage = document.querySelectorAll('div .aspect-custom');
headimage.forEach(image => { image.style.filter = ''; });
});
</script>
- domain: medium.com
headers:
referer: https://t.co/x?amp=1
x-forwarded-for: none
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
content-security-policy: script-src 'self';
cookie:
- domain: tagesspiegel.de
headers:
content-security-policy: script-src 'self';
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
urlMods:
query:
- key: amp
value: 1
- domain: www.ft.com
headers:
referer: https://t.co/x?amp=1
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const styleTags = document.querySelectorAll('link[rel="stylesheet"]');
styleTags.forEach(el => {
const href = el.getAttribute('href').substring(1);
const updatedHref = href.replace(/(https?:\/\/.+?)\/{2,}/, '$1/');
el.setAttribute('href', updatedHref);
});
setTimeout(() => {
const cookie = document.querySelectorAll('.o-cookie-message, .js-article-ribbon, .o-ads, .o-banner, .o-message, .article__content-sign-up');
cookie.forEach(el => { el.remove(); });
}, 1000);
})
</script>

View File

@@ -0,0 +1,35 @@
- domains:
- www.thestar.com
- www.niagarafallsreview.ca
- www.stcatharinesstandard.ca
- www.thepeterboroughexaminer.com
- www.therecord.com
- www.thespec.com
- www.wellandtribune.ca
injections:
- position: head
append: |
<script>
window.localStorage.clear();
document.addEventListener("DOMContentLoaded", () => {
const paywall = document.querySelectorAll('div.subscriber-offers');
paywall.forEach(el => { el.remove(); });
const subscriber_only = document.querySelectorAll('div.subscriber-only');
for (const elem of subscriber_only) {
if (elem.classList.contains('encrypted-content') && dompurify_loaded) {
const parser = new DOMParser();
const doc = parser.parseFromString('<div>' + DOMPurify.sanitize(unscramble(elem.innerText)) + '</div>', 'text/html');
const content_new = doc.querySelector('div');
elem.parentNode.replaceChild(content_new, elem);
}
elem.removeAttribute('style');
elem.removeAttribute('class');
}
const banners = document.querySelectorAll('div.subscription-required, div.redacted-overlay, div.subscriber-hide, div.tnt-ads-container');
banners.forEach(el => { el.remove(); });
const ads = document.querySelectorAll('div.tnt-ads-container, div[class*="adLabelWrapper"]');
ads.forEach(el => { el.remove(); });
const recommendations = document.querySelectorAll('div[id^="tncms-region-article"]');
recommendations.forEach(el => { el.remove(); });
});
</script>

View File

@@ -0,0 +1,24 @@
- domain: www.nzz.ch
paths:
- /international
- /sport
- /wirtschaft
- /technologie
- /feuilleton
- /zuerich
- /wissenschaft
- /gesellschaft
- /panorama
- /mobilitaet
- /reisen
- /meinung
- /finanze
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const paywall = document.querySelector('.dynamic-regwall');
removeDOMElement(paywall);
});
</script>
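removeDOMElement() is not defined in this snippet either; it appears to be a helper that Ladder injects alongside the ruleset scripts. A minimal stand-in, assuming it simply detaches the node if present:

    // hypothetical stand-in for the injected helper
    const removeDOMElement = (el) => { if (el) el.remove(); };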


@@ -0,0 +1,9 @@
# loads amp version of page
- domain: tagesspiegel.de
headers:
content-security-policy: script-src 'self';
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
urlMods:
query:
- key: amp
value: 1
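For reference, the urlMods block appends the query pair to the upstream request, which on Tagesspiegel switches the page to its AMP rendering. A sketch of the intended rewrite (the article path is hypothetical):

    const u = new URL("https://www.tagesspiegel.de/politik/example-article");
    u.searchParams.set("amp", "1");
    console.log(u.toString());
    // https://www.tagesspiegel.de/politik/example-article?amp=1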

rulesets/gb/ft-com.yaml Normal file

@@ -0,0 +1,20 @@
- domain: www.ft.com
headers:
referer: https://t.co/x?amp=1
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const styleTags = document.querySelectorAll('link[rel="stylesheet"]');
styleTags.forEach(el => {
const href = el.getAttribute('href').substring(1);
const updatedHref = href.replace(/(https?:\/\/.+?)\/{2,}/, '$1/');
el.setAttribute('href', updatedHref);
});
setTimeout(() => {
const cookie = document.querySelectorAll('.o-cookie-message, .js-article-ribbon, .o-ads, .o-banner, .o-message, .article__content-sign-up');
cookie.forEach(el => { el.remove(); });
}, 1000);
});
</script>
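The href cleanup above appears to undo doubled slashes that proxy URL prefixing can introduce into stylesheet links. A worked example with a hypothetical href value:

    const href = "/https://www.ft.com//css/main.css".substring(1);
    const cleaned = href.replace(/(https?:\/\/.+?)\/{2,}/, "$1/");
    // cleaned === "https://www.ft.com/css/main.css"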


@@ -0,0 +1,19 @@
- domains:
- www.architecturaldigest.com
- www.bonappetit.com
- www.cntraveler.com
- www.epicurious.com
- www.gq.com
- www.newyorker.com
- www.vanityfair.com
- www.vogue.com
- www.wired.com
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const banners = document.querySelectorAll('.paywall-bar, div[class^="MessageBannerWrapper-"]');
banners.forEach(el => { el.remove(); });
});
</script>


@@ -0,0 +1,16 @@
- domain: americanbanker.com
paths:
- /news
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const inlineGate = document.querySelector('.inline-gate');
if (inlineGate) {
inlineGate.classList.remove('inline-gate');
const inlineGated = document.querySelectorAll('.inline-gated');
for (const elem of inlineGated) { elem.classList.remove('inline-gated'); }
}
});
</script>


@@ -0,0 +1,7 @@
- domain: medium.com
headers:
referer: https://t.co/x?amp=1
x-forwarded-for: none
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
content-security-policy: script-src 'self';
cookie:
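These header values follow the ruleset convention that an empty value blanks the header and "none" suppresses it entirely (an assumption inferred from usage across the rules). The net effect, sketched with the Fetch API Headers type:

    // sketch of the intended effect on the outgoing request headers
    const headers = new Headers({ cookie: "session=abc", "x-forwarded-for": "203.0.113.7" });
    headers.set("cookie", "");         // empty value: Medium sees a fresh, cookieless visit
    headers.delete("x-forwarded-for"); // "none": the client IP is not forwarded at all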


@@ -0,0 +1,17 @@
- domains:
- www.nytimes.com
- www.time.com
headers:
user-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
cookie: nyt-a=; nyt-gdpr=0; nyt-geo=DE; nyt-privacy=1
referer: https://www.google.com/
injections:
- position: head
append: |
<script>
window.localStorage.clear();
document.addEventListener("DOMContentLoaded", () => {
const banners = document.querySelectorAll('div[data-testid="inline-message"], div[id^="ad-"], div[id^="leaderboard-"], div.expanded-dock, div.pz-ad-box, div[id="top-wrapper"], div[id="bottom-wrapper"]');
banners.forEach(el => { el.remove(); });
});
</script>
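This rule impersonates Google's crawler while presetting NYT consent cookies. Sketched as the upstream request Ladder would send (a Node-style illustration with a hypothetical article URL; browsers refuse to override these headers):

    await fetch("https://www.nytimes.com/2023/11/26/example.html", {
      headers: {
        "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Cookie": "nyt-a=; nyt-gdpr=0; nyt-geo=DE; nyt-privacy=1",
        "Referer": "https://www.google.com/",
      },
    });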


@@ -0,0 +1,10 @@
- domain: www.usatoday.com
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const banners = document.querySelectorAll('div.roadblock-container, .gnt_nb, [aria-label="advertisement"], div[id="main-frame-error"]');
banners.forEach(el => { el.remove(); });
});
</script>


@@ -0,0 +1,14 @@
- domain: www.washingtonpost.com
injections:
- position: head
append: |
<script>
document.addEventListener("DOMContentLoaded", () => {
const paywall = document.querySelectorAll('div[data-qa$="-ad"], div[id="leaderboard-wrapper"], div[data-qa="subscribe-promo"]');
paywall.forEach(el => { el.remove(); });
const images = document.querySelectorAll('img');
images.forEach(image => { image.parentElement.style.filter = ''; });
const headimage = document.querySelectorAll('div .aspect-custom');
headimage.forEach(image => { image.style.filter = ''; });
});
</script>

tests/package-lock.json generated Normal file

@@ -0,0 +1,91 @@
{
"name": "tests",
"version": "1.0.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "tests",
"version": "1.0.0",
"license": "ISC",
"devDependencies": {
"@playwright/test": "^1.40.0",
"@types/node": "^20.10.0"
}
},
"node_modules/@playwright/test": {
"version": "1.40.0",
"resolved": "https://registry.npmjs.org/@playwright/test/-/test-1.40.0.tgz",
"integrity": "sha512-PdW+kn4eV99iP5gxWNSDQCbhMaDVej+RXL5xr6t04nbKLCBwYtA046t7ofoczHOm8u6c+45hpDKQVZqtqwkeQg==",
"dev": true,
"dependencies": {
"playwright": "1.40.0"
},
"bin": {
"playwright": "cli.js"
},
"engines": {
"node": ">=16"
}
},
"node_modules/@types/node": {
"version": "20.10.0",
"resolved": "https://registry.npmjs.org/@types/node/-/node-20.10.0.tgz",
"integrity": "sha512-D0WfRmU9TQ8I9PFx9Yc+EBHw+vSpIub4IDvQivcp26PtPrdMGAq5SDcpXEo/epqa/DXotVpekHiLNTg3iaKXBQ==",
"dev": true,
"dependencies": {
"undici-types": "~5.26.4"
}
},
"node_modules/fsevents": {
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz",
"integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==",
"dev": true,
"hasInstallScript": true,
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": "^8.16.0 || ^10.6.0 || >=11.0.0"
}
},
"node_modules/playwright": {
"version": "1.40.0",
"resolved": "https://registry.npmjs.org/playwright/-/playwright-1.40.0.tgz",
"integrity": "sha512-gyHAgQjiDf1m34Xpwzaqb76KgfzYrhK7iih+2IzcOCoZWr/8ZqmdBw+t0RU85ZmfJMgtgAiNtBQ/KS2325INXw==",
"dev": true,
"dependencies": {
"playwright-core": "1.40.0"
},
"bin": {
"playwright": "cli.js"
},
"engines": {
"node": ">=16"
},
"optionalDependencies": {
"fsevents": "2.3.2"
}
},
"node_modules/playwright-core": {
"version": "1.40.0",
"resolved": "https://registry.npmjs.org/playwright-core/-/playwright-core-1.40.0.tgz",
"integrity": "sha512-fvKewVJpGeca8t0ipM56jkVSU6Eo0RmFvQ/MaCQNDYm+sdvKkMBBWTE1FdeMqIdumRaXXjZChWHvIzCGM/tA/Q==",
"dev": true,
"bin": {
"playwright-core": "cli.js"
},
"engines": {
"node": ">=16"
}
},
"node_modules/undici-types": {
"version": "5.26.5",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-5.26.5.tgz",
"integrity": "sha512-JlCMO+ehdEIKqlFxk6IfVoAUVmgz7cU7zD/h9XZ0qzeosSHmUJVOzSQvvYSYWXkFXC+IfLKSIffhv0sVZup6pA==",
"dev": true
}
}
}

tests/package.json Normal file

@@ -0,0 +1,14 @@
{
"name": "tests",
"version": "1.0.0",
"description": "",
"main": "index.js",
"scripts": {},
"keywords": [],
"author": "",
"license": "ISC",
"devDependencies": {
"@playwright/test": "^1.40.0",
"@types/node": "^20.10.0"
}
}


@@ -0,0 +1,77 @@
import { defineConfig, devices } from "@playwright/test";
/**
* Read environment variables from file.
* https://github.com/motdotla/dotenv
*/
// require('dotenv').config();
/**
* See https://playwright.dev/docs/test-configuration.
*/
export default defineConfig({
testDir: "./tests",
/* Run tests in files in parallel */
fullyParallel: true,
/* Fail the build on CI if you accidentally left test.only in the source code. */
forbidOnly: !!process.env.CI,
/* Retry on CI only */
retries: process.env.CI ? 2 : 0,
/* Opt out of parallel tests on CI. */
workers: process.env.CI ? 1 : undefined,
/* Reporter to use. See https://playwright.dev/docs/test-reporters */
reporter: "html",
/* Shared settings for all the projects below. See https://playwright.dev/docs/api/class-testoptions. */
use: {
/* Base URL to use in actions like `await page.goto('/')`. */
// baseURL: 'http://127.0.0.1:3000',
/* Collect trace when retrying the failed test. See https://playwright.dev/docs/trace-viewer */
trace: "on-first-retry",
},
/* Configure projects for major browsers */
projects: [
{
name: "chromium",
use: { ...devices["Desktop Chrome"] },
},
/*
{
name: 'firefox',
use: { ...devices['Desktop Firefox'] },
},
{
name: 'webkit',
use: { ...devices['Desktop Safari'] },
},
*/
/* Test against mobile viewports. */
// {
// name: 'Mobile Chrome',
// use: { ...devices['Pixel 5'] },
// },
// {
// name: 'Mobile Safari',
// use: { ...devices['iPhone 12'] },
// },
/* Test against branded browsers. */
// {
// name: 'Microsoft Edge',
// use: { ...devices['Desktop Edge'], channel: 'msedge' },
// },
// {
// name: 'Google Chrome',
// use: { ...devices['Desktop Chrome'], channel: 'chrome' },
// },
],
/* Run your local dev server before starting the tests */
// webServer: {
// command: 'npm run start',
// url: 'http://127.0.0.1:3000',
// reuseExistingServer: !process.env.CI,
// },
});

tests/run_test.sh Normal file

@@ -0,0 +1,2 @@
npx playwright test
npx playwright show-report
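Usage note: the second command serves the HTML report produced by the reporter: "html" setting in playwright.config.ts. Both commands assume a Ladder instance is already listening on http://localhost:8080 (the ladderURL hard-coded in the tests) and that browsers have been installed once via npx playwright install.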


@@ -0,0 +1,18 @@
import { expect, test } from "@playwright/test";
const paywallText = "This article is exclusive to subscribers.";
const articleURL =
"https://www.wellandtribune.ca/news/niagara-region/niagara-transit-commission-rejects-council-request-to-reduce-its-budget-increase/article_e9fb424c-8df5-58ae-a6c3-3648e2a9df66.html";
const ladderURL = "http://localhost:8080";
const domain = new URL(articleURL).host;
test(`${domain} has paywall by default`, async ({ page }) => {
await page.goto(articleURL);
await expect(page.getByText(paywallText)).toBeVisible();
});
test(`${domain} + Ladder doesn't have paywall`, async ({ page }) => {
await page.goto(`${ladderURL}/${articleURL}`);
await expect(page.getByText(paywallText)).not.toBeVisible();
});
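As more domains gain coverage, the same paired check could be table-driven; a hypothetical sketch (the cases array and test title are illustrative, not part of the repo):

    const cases = [{ url: articleURL, gate: paywallText }];
    for (const { url, gate } of cases) {
      test(`${new URL(url).host} via Ladder shows no gate`, async ({ page }) => {
        await page.goto(`${ladderURL}/${url}`);
        await expect(page.getByText(gate)).not.toBeVisible();
      });
    }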