Compare commits
43 Commits
| SHA1 |
|---|
| 7597ea2807 |
| 235dca8dd0 |
| 191279c00c |
| f4060c3e78 |
| 55284f0b24 |
| f7f4586032 |
| fe881ca661 |
| 86700d8828 |
| 7be62e2735 |
| 5e76ff0879 |
| ee641bf8f6 |
| 2fb089ea28 |
| 9f857eca8b |
| 0673255fc8 |
| 4dbc103cf7 |
| 514facd2c0 |
| a8d920548c |
| e87d19d7f5 |
| 531b7da811 |
| 9a53f28b3f |
| 6cbccbfadb |
| b07d49f230 |
| af10efb7f2 |
| 3f0f4207a1 |
| 2236c4fff9 |
| 78454f8713 |
| 6bff28e18d |
| cdd429e4be |
| 11ee581fd4 |
| 4e44a24261 |
| 0ddb029aae |
| a262afe035 |
| fdca9d39d9 |
| 30a6ab501d |
| 7a51243ff4 |
| c8b94dc702 |
| fbc9567820 |
| 4d5c25c148 |
| 082868af2d |
| a4abce78fb |
| 190de6d9c5 |
| bdd19dcbb6 |
| 02e6b1c090 |
`.github/workflows/build-css.yaml` (vendored, new file, +42)

```yaml
name: Build Tailwind CSS

on:
  push:
    paths:
      - "handlers/form.html"
  workflow_dispatch:

jobs:
  tailwindbuilder:
    permissions:
      # Give the default GITHUB_TOKEN write permission to commit and push the
      # added or changed files to the repository.
      contents: write

    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Install pnpm
        uses: pnpm/action-setup@v2
        with:
          version: 8
      - name: Build Tailwind CSS
        run: pnpm build
      - name: Commit generated stylesheet
        run: |
          if git diff --quiet cmd/styles.css; then
            echo "No changes to commit."
            exit 0
          else
            echo "Changes detected, committing..."
            git config --global user.name "Github action"
            git config --global user.email "username@users.noreply.github.com"
            git add cmd
            git commit -m "Generated stylesheet"
            git push
          fi
```
`.gitignore` (vendored, +1)

```diff
@@ -2,3 +2,4 @@
 ladder

 VERSION
+output.css
```
`README.md` (39 changed lines)

````diff
@@ -17,7 +17,7 @@ Freedom of information is an essential pillar of democracy and informed decision
 ### Features
 - [x] Bypass Paywalls
 - [x] Remove CORS headers from responses, assets, and images ...
-- [x] Apply domain based ruleset/code to modify response
+- [x] Apply domain based ruleset/code to modify response / requested URL
 - [x] Keep site browsable
 - [x] API
 - [x] Fetch RAW HTML
@@ -38,10 +38,10 @@ Freedom of information is an essential pillar of democracy and informed decision
 - [ ] A key to share only one URL

 ### Limitations
-Certain sites may display missing images or encounter formatting issues. This can be attributed to the site's reliance on JavaScript or CSS for image and resource loading, which presents a limitation when accessed through this proxy. If you prefer a full experience, please consider buying a subscription for the site.
-
 Some sites do not expose their content to search engines, which means that the proxy cannot access the content. A future version will try to fetch the content from Google Cache.

+Certain sites may display missing images or encounter formatting issues. This can be attributed to the site's reliance on JavaScript or CSS for image and resource loading, which presents a limitation when accessed through this proxy. If you prefer a full experience, please consider buying a subscription for the site.
+
 ## Installation

 > **Warning:** If your instance will be publicly accessible, make sure to enable Basic Auth. This will prevent unauthorized users from using your proxy. If you do not enable Basic Auth, anyone can use your proxy to browse nasty/illegal stuff. And you will be responsible for it.
@@ -106,7 +106,7 @@ http://localhost:8080/ruleset
 | `LOG_URLS` | Log fetched URL's | `true` |
 | `DISABLE_FORM` | Disables URL Form Frontpage | `false` |
 | `FORM_PATH` | Path to custom Form HTML | `` |
-| `RULESET` | URL to a ruleset file | `https://raw.githubusercontent.com/everywall/ladder/main/ruleset.yaml` or `/path/to/my/rules.yaml` |
+| `RULESET` | Path or URL to a ruleset file, accepts local directories | `https://raw.githubusercontent.com/everywall/ladder/main/ruleset.yaml` or `/path/to/my/rules.yaml` or `/path/to/my/rules/` |
 | `EXPOSE_RULESET` | Make your Ruleset available to other ladders | `true` |
 | `ALLOWED_DOMAINS` | Comma separated list of allowed domains. Empty = no limitations | `` |
 | `ALLOWED_DOMAINS_RULESET` | Allow Domains from Ruleset. false = no limitations | `false` |
@@ -115,12 +115,12 @@ http://localhost:8080/ruleset

 ### Ruleset

-It is possible to apply custom rules to modify the response. This can be used to remove unwanted or modify elements from the page. The ruleset is a YAML file that contains a list of rules for each domain and is loaded on startup
+It is possible to apply custom rules to modify the response or the requested URL. This can be used to remove unwanted or modify elements from the page. The ruleset is a YAML file that contains a list of rules for each domain and is loaded on startup

 See in [ruleset.yaml](ruleset.yaml) for an example.

 ```yaml
-- domain: example.com # Inbcludes all subdomains
+- domain: example.com # Includes all subdomains
   domains: # Additional domains to apply the rule
     - www.example.de
     - www.beispiel.de
@@ -154,5 +154,30 @@ See in [ruleset.yaml](ruleset.yaml) for an example.
       <h1>My Custom Title</h1>
   - position: .left-content article # Position where to inject the code into DOM
     prepend: |
-      <h2>Suptitle</h2>
+      <h2>Subtitle</h2>
+- domain: demo.com
+  headers:
+    content-security-policy: script-src 'self';
+    user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
+  urlMods: # Modify the URL
+    query:
+      - key: amp # (this will append ?amp=1 to the URL)
+        value: 1
+    domain:
+      - match: www # regex to match part of domain
+        replace: amp # (this would modify the domain from www.demo.de to amp.demo.de)
+    path:
+      - match: ^ # regex to match part of path
+        replace: /amp/ # (modify the url from https://www.demo.com/article/ to https://www.demo.de/amp/article/)
 ```

+## Development
+
+To run a development server at http://localhost:8080:
+
+```bash
+echo "dev" > handlers/VERSION
+RULESET="./ruleset.yaml" go run cmd/main.go
+```
+
+This project uses [pnpm](https://pnpm.io/) to build a stylesheet with the [Tailwind CSS](https://tailwindcss.com/) classes. For local development, if you modify styles in `form.html`, run `pnpm build` to generate a new stylesheet.
````
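The `regexRules` feature mentioned alongside the ruleset above pairs a `match` pattern with a `replace` string applied to the fetched response body. A minimal sketch of that technique with Go's `regexp` package (the rule values and the `applyRegexRules` helper are illustrative, not taken from ruleset.yaml):

```go
package main

import (
	"fmt"
	"regexp"
)

// regexRule mirrors the shape of a ruleset `regexRules` entry:
// a regular expression to match and its replacement text.
type regexRule struct {
	Match   string
	Replace string
}

// applyRegexRules runs every rule against the body in order.
func applyRegexRules(body string, rules []regexRule) string {
	for _, r := range rules {
		re := regexp.MustCompile(r.Match)
		body = re.ReplaceAllString(body, r.Replace)
	}
	return body
}

func main() {
	// Hypothetical rule: strip a paywall overlay div from the response body.
	rules := []regexRule{
		{Match: `<div class="paywall">.*?</div>`, Replace: ""},
	}
	body := `<p>article</p><div class="paywall">subscribe!</div>`
	fmt.Println(applyRegexRules(body, rules)) // <p>article</p>
}
```

Note that Go's RE2 engine has no backreferences, so rules must stick to patterns RE2 supports.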
`cmd/main.go` (21 changed lines)

```diff
@@ -1,7 +1,7 @@
 package main

 import (
-	_ "embed"
+	"embed"
 	"fmt"
 	"log"
 	"os"
@@ -17,6 +17,8 @@ import (

 //go:embed favicon.ico
 var faviconData string
+
+//go:embed styles.css
+var cssData embed.FS

 func main() {
 	parser := argparse.NewParser("ladder", "Every Wall needs a Ladder")
@@ -36,6 +38,11 @@ func main() {
 		Help: "This will spawn multiple processes listening",
 	})

+	ruleset := parser.String("r", "ruleset", &argparse.Options{
+		Required: false,
+		Help:     "File, Directory or URL to a ruleset.yml. Overrides RULESET environment variable.",
+	})
+
 	err := parser.Parse(os.Args)
 	if err != nil {
 		fmt.Print(parser.Usage(err))
@@ -75,12 +82,18 @@ func main() {
 	}

 	app.Get("/", handlers.Form)
+	app.Get("/styles.css", func(c *fiber.Ctx) error {
+		cssData, err := cssData.ReadFile("styles.css")
+		if err != nil {
+			return c.Status(fiber.StatusInternalServerError).SendString("Internal Server Error")
+		}
+		c.Set("Content-Type", "text/css")
+		return c.Send(cssData)
+	})
 	app.Get("ruleset", handlers.Ruleset)

 	app.Get("raw/*", handlers.Raw)
 	app.Get("api/*", handlers.Api)
-	app.Get("ruleset", handlers.Raw)
-	app.Get("/*", handlers.ProxySite)
+	app.Get("/*", handlers.ProxySite(*ruleset))

 	log.Fatal(app.Listen(":" + *port))
 }
```
`cmd/styles.css` (new file, 1 line)

File diff suppressed because one or more lines are too long
```diff
@@ -9,10 +9,11 @@ services:
     environment:
       - PORT=8080
       - RULESET=/app/ruleset.yaml
+      #- ALLOWED_DOMAINS=example.com,example.org
       #- ALLOWED_DOMAINS_RULESET=false
       #- EXPOSE_RULESET=true
       #- PREFORK=false
-      #- DISABLE_FORM=fase
+      #- DISABLE_FORM=false
       #- FORM_PATH=/app/form.html
       #- X_FORWARDED_FOR=66.249.66.1
       #- USER_AGENT=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
```
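The `ALLOWED_DOMAINS` value from the environment is split on commas at startup. One subtlety worth knowing: in Go, `strings.Split("", ",")` returns a one-element slice containing the empty string, so an unset variable needs explicit handling to mean "no limitations". A small illustrative sketch (the `parseAllowedDomains` helper is hypothetical, not from the codebase):

```go
package main

import (
	"fmt"
	"strings"
)

// parseAllowedDomains splits a comma-separated ALLOWED_DOMAINS value,
// returning nil for an empty value so "empty = no limitations" holds.
func parseAllowedDomains(env string) []string {
	if env == "" {
		return nil // no restriction configured
	}
	return strings.Split(env, ",")
}

func main() {
	fmt.Println(parseAllowedDomains(""))                        // []
	fmt.Println(parseAllowedDomains("example.com,example.org")) // [example.com example.org]
}
```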
```diff
@@ -3,8 +3,9 @@

 <head>
   <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
   <title>ladder</title>
-  <script src="https://cdn.tailwindcss.com"></script>
+  <link rel="stylesheet" href="/styles.css">
 </head>

 <body class="antialiased text-slate-500 dark:text-slate-400 bg-white dark:bg-slate-900">
@@ -16,12 +17,15 @@
   <header>
     <h1 class="text-center text-3xl sm:text-4xl font-extrabold text-slate-900 tracking-tight dark:text-slate-200">ladddddddder</h1>
   </header>
-  <form id="inputForm" method="get" class="mx-4">
+  <form id="inputForm" method="get" class="mx-4 relative">
     <div>
       <input type="text" id="inputField" placeholder="Proxy Search" name="inputField" class="w-full text-sm leading-6 text-slate-400 rounded-md ring-1 ring-slate-900/10 shadow-sm py-1.5 pl-2 pr-3 hover:ring-slate-300 dark:bg-slate-800 dark:highlight-white/5 dark:hover:bg-slate-700" required autofocus>
+      <button id="clearButton" type="button" aria-label="Clear Search" title="Clear Search" class="hidden absolute inset-y-0 right-0 items-center pr-2 hover:text-slate-400 hover:dark:text-slate-300" tabindex="-1">
+        <svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M18 6 6 18"/><path d="m6 6 12 12"/></svg>
+      </button>
     </div>
   </form>
-  <footer class="mt-10 text-center text-slate-600 dark:text-slate-400">
+  <footer class="mt-10 mx-4 text-center text-slate-600 dark:text-slate-400">
     <p>
       Code Licensed Under GPL v3.0 |
       <a href="https://github.com/everywall/ladder" class="hover:text-blue-500 hover:underline underline-offset-2 transition-colors duration-300">View Source</a> |
@@ -31,9 +35,6 @@
   </div>

   <script>
-    document.addEventListener('DOMContentLoaded', function () {
-      M.AutoInit();
-    });
     document.getElementById('inputForm').addEventListener('submit', function (e) {
       e.preventDefault();
       let url = document.getElementById('inputField').value;
@@ -43,6 +44,19 @@
       window.location.href = '/' + url;
       return false;
     });
+    document.getElementById('inputField').addEventListener('input', function() {
+      const clearButton = document.getElementById('clearButton');
+      if (this.value.trim().length > 0) {
+        clearButton.style.display = 'block';
+      } else {
+        clearButton.style.display = 'none';
+      }
+    });
+    document.getElementById('clearButton').addEventListener('click', function() {
+      document.getElementById('inputField').value = '';
+      this.style.display = 'none';
+      document.getElementById('inputField').focus();
+    });
   </script>

 <style>
```
```diff
@@ -10,21 +10,94 @@ import (
 	"regexp"
 	"strings"

+	"ladder/pkg/ruleset"
+
 	"github.com/PuerkitoBio/goquery"
 	"github.com/gofiber/fiber/v2"
-	"gopkg.in/yaml.v3"
 )

 var (
 	UserAgent    = getenv("USER_AGENT", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
 	ForwardedFor = getenv("X_FORWARDED_FOR", "66.249.66.1")
-	rulesSet       = loadRules()
-	allowedDomains = strings.Split(os.Getenv("ALLOWED_DOMAINS"), ",")
+	rulesSet       = ruleset.NewRulesetFromEnv()
+	allowedDomains = []string{}
 )

-func ProxySite(c *fiber.Ctx) error {
+func init() {
+	allowedDomains = strings.Split(os.Getenv("ALLOWED_DOMAINS"), ",")
+	if os.Getenv("ALLOWED_DOMAINS_RULESET") == "true" {
+		allowedDomains = append(allowedDomains, rulesSet.Domains()...)
+	}
+}
+
+// extracts a URL from the request ctx. If the URL in the request
+// is a relative path, it reconstructs the full URL using the referer header.
+func extractUrl(c *fiber.Ctx) (string, error) {
+	// try to extract url-encoded
+	reqUrl, err := url.QueryUnescape(c.Params("*"))
+	if err != nil {
+		// fallback
+		reqUrl = c.Params("*")
+	}
+
+	// Extract the actual path from req ctx
+	urlQuery, err := url.Parse(reqUrl)
+	if err != nil {
+		return "", fmt.Errorf("error parsing request URL '%s': %v", reqUrl, err)
+	}
+
+	isRelativePath := urlQuery.Scheme == ""
+
+	// eg: https://localhost:8080/images/foobar.jpg -> https://realsite.com/images/foobar.jpg
+	if isRelativePath {
+		// Parse the referer URL from the request header.
+		refererUrl, err := url.Parse(c.Get("referer"))
+		if err != nil {
+			return "", fmt.Errorf("error parsing referer URL from req: '%s': %v", reqUrl, err)
+		}
+
+		// Extract the real url from referer path
+		realUrl, err := url.Parse(strings.TrimPrefix(refererUrl.Path, "/"))
+		if err != nil {
+			return "", fmt.Errorf("error parsing real URL from referer '%s': %v", refererUrl.Path, err)
+		}
+
+		// reconstruct the full URL using the referer's scheme, host, and the relative path / queries
+		fullUrl := &url.URL{
+			Scheme:   realUrl.Scheme,
+			Host:     realUrl.Host,
+			Path:     urlQuery.Path,
+			RawQuery: urlQuery.RawQuery,
+		}
+
+		if os.Getenv("LOG_URLS") == "true" {
+			log.Printf("modified relative URL: '%s' -> '%s'", reqUrl, fullUrl.String())
+		}
+		return fullUrl.String(), nil
+	}
+
+	// default behavior:
+	// eg: https://localhost:8080/https://realsite.com/images/foobar.jpg -> https://realsite.com/images/foobar.jpg
+	return urlQuery.String(), nil
+}
+
+func ProxySite(rulesetPath string) fiber.Handler {
+	if rulesetPath != "" {
+		rs, err := ruleset.NewRuleset(rulesetPath)
+		if err != nil {
+			panic(err)
+		}
+		rulesSet = rs
+	}
+
+	return func(c *fiber.Ctx) error {
 	// Get the url from the URL
-	url := c.Params("*")
+	url, err := extractUrl(c)
+	if err != nil {
+		log.Println("ERROR In URL extraction:", err)
+	}
+
 	queries := c.Queries()
 	body, _, resp, err := fetchSite(url, queries)
@@ -34,10 +107,48 @@
 		return c.SendString(err.Error())
 	}

+	c.Cookie(&fiber.Cookie{})
 	c.Set("Content-Type", resp.Header.Get("Content-Type"))
 	c.Set("Content-Security-Policy", resp.Header.Get("Content-Security-Policy"))

 	return c.SendString(body)
+	}
+}
+
+func modifyURL(uri string, rule ruleset.Rule) (string, error) {
+	newUrl, err := url.Parse(uri)
+	if err != nil {
+		return "", err
+	}
+
+	for _, urlMod := range rule.UrlMods.Domain {
+		re := regexp.MustCompile(urlMod.Match)
+		newUrl.Host = re.ReplaceAllString(newUrl.Host, urlMod.Replace)
+	}
+
+	for _, urlMod := range rule.UrlMods.Path {
+		re := regexp.MustCompile(urlMod.Match)
+		newUrl.Path = re.ReplaceAllString(newUrl.Path, urlMod.Replace)
+	}
+
+	v := newUrl.Query()
+	for _, query := range rule.UrlMods.Query {
+		if query.Value == "" {
+			v.Del(query.Key)
+			continue
+		}
+		v.Set(query.Key, query.Value)
+	}
+	newUrl.RawQuery = v.Encode()
+
+	if rule.GoogleCache {
+		newUrl, err = url.Parse("https://webcache.googleusercontent.com/search?q=cache:" + newUrl.String())
+		if err != nil {
+			return "", err
+		}
+	}
+
+	return newUrl.String(), nil
 }

 func fetchSite(urlpath string, queries map[string]string) (string, *http.Request, *http.Response, error) {
@@ -63,18 +174,16 @@ func fetchSite(urlpath string, queries map[string]string) (string, *http.Request
 		log.Println(u.String() + urlQuery)
 	}

+	// Modify the URI according to ruleset
 	rule := fetchRule(u.Host, u.Path)
-	if rule.GoogleCache {
-		u, err = url.Parse("https://webcache.googleusercontent.com/search?q=cache:" + u.String())
-		if err != nil {
-			return "", nil, nil, err
-		}
-	}
+	url, err := modifyURL(u.String()+urlQuery, rule)
+	if err != nil {
+		return "", nil, nil, err
+	}

 	// Fetch the site
 	client := &http.Client{}
-	req, _ := http.NewRequest("GET", u.String()+urlQuery, nil)
+	req, _ := http.NewRequest("GET", url, nil)

 	if rule.Headers.UserAgent != "" {
 		req.Header.Set("User-Agent", rule.Headers.UserAgent)
@@ -114,6 +223,7 @@ func fetchSite(urlpath string, queries map[string]string) (string, *http.Request
 	}

 	if rule.Headers.CSP != "" {
+		//log.Println(rule.Headers.CSP)
 		resp.Header.Set("Content-Security-Policy", rule.Headers.CSP)
 	}
@@ -122,7 +232,7 @@
 	return body, req, resp, nil
 }

-func rewriteHtml(bodyB []byte, u *url.URL, rule Rule) string {
+func rewriteHtml(bodyB []byte, u *url.URL, rule ruleset.Rule) string {
 	// Rewrite the HTML
 	body := string(bodyB)
@@ -156,63 +266,11 @@ func getenv(key, fallback string) string {
 	return value
 }

-func loadRules() RuleSet {
-	rulesUrl := os.Getenv("RULESET")
-	if rulesUrl == "" {
-		RulesList := RuleSet{}
-		return RulesList
-	}
-	log.Println("Loading rules")
-
-	var ruleSet RuleSet
-	if strings.HasPrefix(rulesUrl, "http") {
-
-		resp, err := http.Get(rulesUrl)
-		if err != nil {
-			log.Println("ERROR:", err)
-		}
-		defer resp.Body.Close()
-
-		if resp.StatusCode >= 400 {
-			log.Println("ERROR:", resp.StatusCode, rulesUrl)
-		}
-
-		body, err := io.ReadAll(resp.Body)
-		if err != nil {
-			log.Println("ERROR:", err)
-		}
-		yaml.Unmarshal(body, &ruleSet)
-
-		if err != nil {
-			log.Println("ERROR:", err)
-		}
-	} else {
-		yamlFile, err := os.ReadFile(rulesUrl)
-		if err != nil {
-			log.Println("ERROR:", err)
-		}
-		yaml.Unmarshal(yamlFile, &ruleSet)
-	}
-
-	domains := []string{}
-	for _, rule := range ruleSet {
-		domains = append(domains, rule.Domain)
-		domains = append(domains, rule.Domains...)
-		if os.Getenv("ALLOWED_DOMAINS_RULESET") == "true" {
-			allowedDomains = append(allowedDomains, domains...)
-		}
-	}
-
-	log.Println("Loaded ", len(ruleSet), " rules for", len(domains), "Domains")
-	return ruleSet
-}
-
-func fetchRule(domain string, path string) Rule {
+func fetchRule(domain string, path string) ruleset.Rule {
 	if len(rulesSet) == 0 {
-		return Rule{}
+		return ruleset.Rule{}
 	}
-	rule := Rule{}
+	rule := ruleset.Rule{}
 	for _, rule := range rulesSet {
 		domains := rule.Domains
 		if rule.Domain != "" {
@@ -231,7 +289,7 @@ func fetchRule(domain string, path string) Rule {
 		return rule
 	}

-func applyRules(body string, rule Rule) string {
+func applyRules(body string, rule ruleset.Rule) string {
 	if len(rulesSet) == 0 {
 		return body
 	}
```
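The `modifyURL` additions above apply `urlMods` regexes to the host and path and set query keys via `net/url`. A self-contained sketch of the same technique, mirroring the README's `www` → `amp` example (the `rewriteURL` helper and its parameter values are illustrative, not the project's API):

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// rewriteURL applies a host regex replacement and forces a query key,
// the same two steps modifyURL performs for urlMods rules.
func rewriteURL(raw, hostMatch, hostReplace, queryKey, queryValue string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	// Rewrite part of the host, e.g. "www" -> "amp".
	u.Host = regexp.MustCompile(hostMatch).ReplaceAllString(u.Host, hostReplace)
	// Set (or overwrite) a query parameter, e.g. ?amp=1.
	q := u.Query()
	q.Set(queryKey, queryValue)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	out, _ := rewriteURL("https://www.demo.com/article/", "www", "amp", "amp", "1")
	fmt.Println(out) // https://amp.demo.com/article/?amp=1
}
```

Because `ReplaceAllString` replaces every occurrence, a loose pattern like `www` would also match inside a longer hostname; anchoring (e.g. `^www`) is the safer form in real rules.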
```diff
@@ -2,6 +2,7 @@
 package handlers

 import (
+	"ladder/pkg/ruleset"
 	"net/http"
 	"net/http/httptest"
 	"net/url"
@@ -13,7 +14,7 @@ import (

 func TestProxySite(t *testing.T) {
 	app := fiber.New()
-	app.Get("/:url", ProxySite)
+	app.Get("/:url", ProxySite(""))

 	req := httptest.NewRequest("GET", "/https://example.com", nil)
 	resp, err := app.Test(req)
@@ -51,7 +52,7 @@ func TestRewriteHtml(t *testing.T) {
 	</html>
 	`

-	actual := rewriteHtml(bodyB, u, Rule{})
+	actual := rewriteHtml(bodyB, u, ruleset.Rule{})
 	assert.Equal(t, expected, actual)
 }
```
(deleted file)

```diff
@@ -1,29 +0,0 @@
-package handlers
-
-type Regex struct {
-	Match   string `yaml:"match"`
-	Replace string `yaml:"replace"`
-}
-
-type RuleSet []Rule
-
-type Rule struct {
-	Domain  string   `yaml:"domain,omitempty"`
-	Domains []string `yaml:"domains,omitempty"`
-	Paths   []string `yaml:"paths,omitempty"`
-	Headers struct {
-		UserAgent     string `yaml:"user-agent,omitempty"`
-		XForwardedFor string `yaml:"x-forwarded-for,omitempty"`
-		Referer       string `yaml:"referer,omitempty"`
-		Cookie        string `yaml:"cookie,omitempty"`
-		CSP           string `yaml:"content-security-policy,omitempty"`
-	} `yaml:"headers,omitempty"`
-	GoogleCache bool    `yaml:"googleCache,omitempty"`
-	RegexRules  []Regex `yaml:"regexRules"`
-	Injections  []struct {
-		Position string `yaml:"position"`
-		Append   string `yaml:"append"`
-		Prepend  string `yaml:"prepend"`
-		Replace  string `yaml:"replace"`
-	} `yaml:"injections"`
-}
```
`package.json` (new file, +9)

```json
{
  "scripts": {
    "build": "pnpx tailwindcss -i ./styles/input.css -o ./styles/output.css --build && pnpx minify ./styles/output.css > ./cmd/styles.css"
  },
  "devDependencies": {
    "minify": "^10.5.2",
    "tailwindcss": "^3.3.5"
  }
}
```
286
pkg/ruleset/ruleset.go
Normal file
286
pkg/ruleset/ruleset.go
Normal file
@@ -0,0 +1,286 @@
|
|||||||
|
package ruleset
|
||||||
|
|
||||||
|
import (
|
||||||
|
"errors"
|
||||||
|
"fmt"
|
||||||
|
"io"
|
||||||
|
"log"
|
||||||
|
"net/http"
|
||||||
|
"os"
|
||||||
|
"path/filepath"
|
||||||
|
"regexp"
|
||||||
|
"strings"
|
||||||
|
|
||||||
|
"compress/gzip"
|
||||||
|
|
||||||
|
"gopkg.in/yaml.v3"
|
||||||
|
)
|
||||||
|
|
||||||
|
type Regex struct {
|
||||||
|
Match string `yaml:"match"`
|
||||||
|
Replace string `yaml:"replace"`
|
||||||
|
}
|
||||||
|
type KV struct {
|
||||||
|
Key string `yaml:"key"`
|
||||||
|
Value string `yaml:"value"`
|
||||||
|
}
|
||||||
|
|
||||||
|
type RuleSet []Rule
|
||||||
|
|
||||||
|
type Rule struct {
|
||||||
|
Domain string `yaml:"domain,omitempty"`
|
||||||
|
Domains []string `yaml:"domains,omitempty"`
|
||||||
|
Paths []string `yaml:"paths,omitempty"`
|
||||||
|
Headers struct {
|
||||||
|
UserAgent string `yaml:"user-agent,omitempty"`
|
||||||
|
XForwardedFor string `yaml:"x-forwarded-for,omitempty"`
|
||||||
|
Referer string `yaml:"referer,omitempty"`
|
||||||
|
Cookie string `yaml:"cookie,omitempty"`
|
||||||
|
CSP string `yaml:"content-security-policy,omitempty"`
|
||||||
|
} `yaml:"headers,omitempty"`
|
||||||
|
GoogleCache bool `yaml:"googleCache,omitempty"`
|
||||||
|
RegexRules []Regex `yaml:"regexRules"`
|
||||||
|
|
||||||
|
UrlMods struct {
|
||||||
|
Domain []Regex `yaml:"domain"`
|
||||||
|
Path []Regex `yaml:"path"`
|
||||||
|
Query []KV `yaml:"query"`
|
||||||
|
} `yaml:"urlMods"`
|
||||||
|
|
||||||
|
Injections []struct {
|
||||||
|
Position string `yaml:"position"`
|
||||||
|
Append string `yaml:"append"`
|
||||||
|
Prepend string `yaml:"prepend"`
|
||||||
|
Replace string `yaml:"replace"`
|
||||||
|
} `yaml:"injections"`
|
||||||
|
}
|
||||||
|
|
// NewRulesetFromEnv creates a new RuleSet based on the RULESET environment variable.
// It logs a warning and returns an empty RuleSet if the RULESET environment variable is not set.
// If a local ruleset is specified but not found on disk, NewRuleset panics.
func NewRulesetFromEnv() RuleSet {
	rulesPath, ok := os.LookupEnv("RULESET")
	if !ok {
		log.Printf("WARN: No ruleset specified. Set the `RULESET` environment variable to load one for a better success rate.")
		return RuleSet{}
	}
	ruleSet, err := NewRuleset(rulesPath)
	if err != nil {
		log.Println(err)
	}
	return ruleSet
}

// NewRuleset loads a RuleSet from a given string of rule paths, separated by semicolons.
// It supports loading rules from both local file paths and remote URLs.
// Returns a RuleSet and an error if any issues occur during loading.
func NewRuleset(rulePaths string) (RuleSet, error) {
	ruleSet := RuleSet{}
	errs := []error{}

	rp := strings.Split(rulePaths, ";")
	remoteRegex := regexp.MustCompile(`^https?://(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()!@:%_\+.~#?&/=]*)`)
	for _, rule := range rp {
		rulePath := strings.Trim(rule, " ")
		var err error

		if remoteRegex.MatchString(rulePath) {
			err = ruleSet.loadRulesFromRemoteFile(rulePath)
		} else {
			err = ruleSet.loadRulesFromLocalDir(rulePath)
		}

		if err != nil {
			e := fmt.Errorf("WARN: failed to load ruleset from '%s'", rulePath)
			errs = append(errs, errors.Join(e, err))
			continue
		}
	}

	if len(errs) != 0 {
		errs = append(errs, fmt.Errorf("WARN: failed to load %d of %d rulesets", len(errs), len(rp)))
		// Panic if the user specified a local ruleset but it wasn't found on
		// disk -- don't fail silently.
		for _, err := range errs {
			if errors.Is(err, os.ErrNotExist) {
				panic(errors.Join(fmt.Errorf("PANIC: ruleset not found"), err))
			}
		}
		// Otherwise, bubble up any errors, such as syntax or remote host issues.
		return ruleSet, errors.Join(errs...)
	}
	ruleSet.PrintStats()
	return ruleSet, nil
}
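NewRuleset decides per entry whether to fetch over HTTP or walk the local filesystem. A minimal, self-contained sketch of that dispatch — the regex is a simplified stand-in for the one above, and `isRemote` is an illustrative helper, not part of the package:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// remoteRegex mirrors the shape of the detection used by NewRuleset:
// anything that looks like an http(s) URL is fetched remotely,
// everything else is treated as a local path.
var remoteRegex = regexp.MustCompile(`^https?://`)

func isRemote(rulePath string) bool {
	return remoteRegex.MatchString(rulePath)
}

func main() {
	// Semicolon-separated list, as accepted by NewRuleset.
	for _, p := range strings.Split("./rules; https://example.com/rules.yaml", ";") {
		p = strings.Trim(p, " ")
		fmt.Printf("%s -> remote=%v\n", p, isRemote(p))
	}
}
```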
// ================== RULESET loading logic ===================================

// loadRulesFromLocalDir loads rules from a local directory specified by the path.
// It walks through the directory, loading rules from YAML files.
// Returns an error if the directory cannot be accessed.
// If any single file fails to load, it is logged and skipped.
func (rs *RuleSet) loadRulesFromLocalDir(path string) error {
	if _, err := os.Stat(path); err != nil {
		return err
	}

	// Anchored so that only *.yaml / *.yml files match (the previous
	// unanchored pattern also accepted names like "rules.yaml.bak").
	yamlRegex := regexp.MustCompile(`\.ya?ml$`)

	return filepath.Walk(path, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if info.IsDir() {
			return nil
		}
		if !yamlRegex.MatchString(path) {
			return nil
		}

		if err := rs.loadRulesFromLocalFile(path); err != nil {
			log.Printf("WARN: failed to load directory ruleset '%s': %s, skipping", path, err)
			return nil
		}
		log.Printf("INFO: loaded ruleset %s\n", path)
		return nil
	})
}
// loadRulesFromLocalFile loads rules from a local YAML file specified by the path.
// Returns an error if the file cannot be read or if there's a syntax error in the YAML.
func (rs *RuleSet) loadRulesFromLocalFile(path string) error {
	yamlFile, err := os.ReadFile(path)
	if err != nil {
		e := fmt.Errorf("failed to read rules from local file: '%s'", path)
		return errors.Join(e, err)
	}

	var r RuleSet
	if err := yaml.Unmarshal(yamlFile, &r); err != nil {
		e := fmt.Errorf("failed to load rules from local file, possible syntax error in '%s'", path)
		ee := errors.Join(e, err)
		if _, ok := os.LookupEnv("DEBUG"); ok {
			debugPrintRule(string(yamlFile), ee)
		}
		return ee
	}
	*rs = append(*rs, r...)
	return nil
}
// loadRulesFromRemoteFile loads rules from a remote URL.
// It supports plain and gzip-compressed content.
// Returns an error if there's an issue accessing the URL or if there's a syntax error in the YAML.
func (rs *RuleSet) loadRulesFromRemoteFile(rulesUrl string) error {
	var r RuleSet
	resp, err := http.Get(rulesUrl)
	if err != nil {
		e := fmt.Errorf("failed to load rules from remote url '%s'", rulesUrl)
		return errors.Join(e, err)
	}
	defer resp.Body.Close()

	if resp.StatusCode >= 400 {
		return fmt.Errorf("failed to load rules from remote url (%s) on '%s'", resp.Status, rulesUrl)
	}

	var reader io.Reader
	isGzip := strings.HasSuffix(rulesUrl, ".gz") || strings.HasSuffix(rulesUrl, ".gzip") || resp.Header.Get("Content-Encoding") == "gzip"

	if isGzip {
		reader, err = gzip.NewReader(resp.Body)
		if err != nil {
			return fmt.Errorf("failed to create gzip reader for URL '%s' with status code '%s': %w", rulesUrl, resp.Status, err)
		}
	} else {
		reader = resp.Body
	}

	if err := yaml.NewDecoder(reader).Decode(&r); err != nil {
		e := fmt.Errorf("failed to load rules from remote url '%s' with status code '%s' and possible syntax error", rulesUrl, resp.Status)
		return errors.Join(e, err)
	}

	*rs = append(*rs, r...)
	return nil
}
// ================= utility methods ==========================

// Yaml returns the ruleset as a YAML string.
func (rs *RuleSet) Yaml() (string, error) {
	y, err := yaml.Marshal(rs)
	if err != nil {
		return "", err
	}
	return string(y), nil
}

// GzipYaml returns an io.Reader that streams the gzip-compressed YAML representation of the RuleSet.
func (rs *RuleSet) GzipYaml() (io.Reader, error) {
	pr, pw := io.Pipe()

	go func() {
		gw := gzip.NewWriter(pw)
		if err := yaml.NewEncoder(gw).Encode(rs); err != nil {
			gw.Close()
			pw.CloseWithError(err)
			return
		}
		// Close the gzip writer first so its trailer is flushed into the
		// pipe, then close the pipe, propagating any close error.
		pw.CloseWithError(gw.Close())
	}()

	return pr, nil
}
// Domains extracts and returns a slice of all domains present in the RuleSet.
func (rs *RuleSet) Domains() []string {
	var domains []string
	for _, rule := range *rs {
		if rule.Domain != "" {
			domains = append(domains, rule.Domain)
		}
		domains = append(domains, rule.Domains...)
	}
	return domains
}

// DomainCount returns the number of domain entries present in the RuleSet.
func (rs *RuleSet) DomainCount() int {
	return len(rs.Domains())
}

// Count returns the total number of rules in the RuleSet.
func (rs *RuleSet) Count() int {
	return len(*rs)
}

// PrintStats logs the number of rules and domains loaded in the RuleSet.
func (rs *RuleSet) PrintStats() {
	log.Printf("INFO: Loaded %d rules for %d domains\n", rs.Count(), rs.DomainCount())
}

// debugPrintRule is a utility function for printing a rule and the associated error for debugging purposes.
func debugPrintRule(rule string, err error) {
	fmt.Println("------------------------------ BEGIN DEBUG RULESET -----------------------------")
	fmt.Printf("%s\n", err.Error())
	fmt.Println("--------------------------------------------------------------------------------")
	fmt.Println(rule)
	fmt.Println("------------------------------ END DEBUG RULESET -------------------------------")
}
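Domains returns one entry per listed domain, so the same domain can appear more than once across rules. If a distinct count is ever wanted, a set-based dedupe is a short addition (illustrative helper, not part of the diff):

```go
package main

import "fmt"

// uniqueDomains collapses duplicates while preserving first-seen order --
// a sketch of how DomainCount could report distinct domains instead.
func uniqueDomains(domains []string) []string {
	seen := make(map[string]struct{}, len(domains))
	var out []string
	for _, d := range domains {
		if d == "" {
			continue // rules without a top-level domain contribute nothing
		}
		if _, ok := seen[d]; ok {
			continue
		}
		seen[d] = struct{}{}
		out = append(out, d)
	}
	return out
}

func main() {
	got := uniqueDomains([]string{"example.com", "", "ft.com", "example.com"})
	fmt.Println(got) // [example.com ft.com]
}
```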
153	pkg/ruleset/ruleset_test.go	Normal file
@@ -0,0 +1,153 @@
package ruleset

import (
	"os"
	"path/filepath"
	"testing"
	"time"

	"github.com/gofiber/fiber/v2"
	"github.com/stretchr/testify/assert"
)

var (
	validYAML = `
- domain: example.com
  regexRules:
    - match: "^http:"
      replace: "https:"`

	invalidYAML = `
- domain: [thisIsATestYamlThatIsMeantToFail.example]
  regexRules:
    - match: "^http:"
      replace: "https:"
    - match: "[incomplete"`
)

func TestLoadRulesFromRemoteFile(t *testing.T) {
	app := fiber.New()
	defer app.Shutdown()

	app.Get("/valid-config.yml", func(c *fiber.Ctx) error {
		return c.SendString(validYAML)
	})
	app.Get("/invalid-config.yml", func(c *fiber.Ctx) error {
		return c.SendString(invalidYAML)
	})

	app.Get("/valid-config.gz", func(c *fiber.Ctx) error {
		c.Set("Content-Type", "application/octet-stream")
		rs, err := loadRuleFromString(validYAML)
		if err != nil {
			t.Errorf("failed to load valid yaml from string: %s", err.Error())
		}
		s, err := rs.GzipYaml()
		if err != nil {
			t.Errorf("failed to gzip-serialize yaml: %s", err.Error())
		}

		if err := c.SendStream(s); err != nil {
			t.Errorf("failed to stream gzip-serialized yaml: %s", err.Error())
		}
		return nil
	})

	// Start the server in a goroutine.
	go func() {
		if err := app.Listen("127.0.0.1:9999"); err != nil {
			t.Errorf("Server failed to start: %s", err.Error())
		}
	}()

	// Wait for the server to start.
	time.Sleep(time.Second * 1)

	rs, err := NewRuleset("http://127.0.0.1:9999/valid-config.yml")
	if err != nil {
		t.Errorf("failed to load plaintext ruleset from http server: %s", err.Error())
	}
	assert.Equal(t, rs[0].Domain, "example.com")

	rs, err = NewRuleset("http://127.0.0.1:9999/valid-config.gz")
	if err != nil {
		t.Errorf("failed to load gzipped ruleset from http server: %s", err.Error())
	}
	assert.Equal(t, rs[0].Domain, "example.com")

	os.Setenv("RULESET", "http://127.0.0.1:9999/valid-config.gz")
	rs = NewRulesetFromEnv()
	if !assert.Equal(t, rs[0].Domain, "example.com") {
		t.Error("expected no errors loading ruleset from gzip url using environment variable, but got one")
	}
}

func loadRuleFromString(yaml string) (RuleSet, error) {
	// Write the YAML to a temporary file and load it from disk.
	tmpFile, _ := os.CreateTemp("", "ruleset*.yaml")
	defer os.Remove(tmpFile.Name())
	tmpFile.WriteString(yaml)
	rs := RuleSet{}
	err := rs.loadRulesFromLocalFile(tmpFile.Name())
	return rs, err
}

// TestLoadRulesFromLocalFile tests the loading of rules from a local YAML file.
func TestLoadRulesFromLocalFile(t *testing.T) {
	rs, err := loadRuleFromString(validYAML)
	if err != nil {
		t.Errorf("Failed to load rules from valid YAML: %s", err)
	}
	assert.Equal(t, rs[0].Domain, "example.com")
	assert.Equal(t, rs[0].RegexRules[0].Match, "^http:")
	assert.Equal(t, rs[0].RegexRules[0].Replace, "https:")

	_, err = loadRuleFromString(invalidYAML)
	if err == nil {
		t.Errorf("Expected an error when loading invalid YAML, but got none")
	}
}

// TestLoadRulesFromLocalDir tests the loading of rules from a local nested directory full of yaml rulesets.
func TestLoadRulesFromLocalDir(t *testing.T) {
	// Create a temporary directory.
	baseDir, err := os.MkdirTemp("", "ruleset_test")
	if err != nil {
		t.Fatalf("Failed to create temporary directory: %s", err)
	}
	defer os.RemoveAll(baseDir)

	// Create a nested subdirectory.
	nestedDir := filepath.Join(baseDir, "nested")
	if err := os.Mkdir(nestedDir, 0755); err != nil {
		t.Fatalf("Failed to create nested directory: %s", err)
	}

	// Create a second level of nesting.
	nestedTwiceDir := filepath.Join(nestedDir, "nestedTwice")
	if err := os.Mkdir(nestedTwiceDir, 0755); err != nil {
		t.Fatalf("Failed to create doubly nested directory: %s", err)
	}

	testCases := []string{"test.yaml", "test2.yaml", "test-3.yaml", "test 4.yaml", "1987.test.yaml.yml", "foobar.example.com.yaml", "foobar.com.yml"}
	for _, fileName := range testCases {
		filePath := filepath.Join(nestedDir, "2x-"+fileName)
		os.WriteFile(filePath, []byte(validYAML), 0644)
		filePath = filepath.Join(nestedDir, fileName)
		os.WriteFile(filePath, []byte(validYAML), 0644)
		filePath = filepath.Join(baseDir, "base-"+fileName)
		os.WriteFile(filePath, []byte(validYAML), 0644)
	}
	rs := RuleSet{}
	err = rs.loadRulesFromLocalDir(baseDir)
	assert.NoError(t, err)
	assert.Equal(t, rs.Count(), len(testCases)*3)

	for _, rule := range rs {
		assert.Equal(t, rule.Domain, "example.com")
		assert.Equal(t, rule.RegexRules[0].Match, "^http:")
		assert.Equal(t, rule.RegexRules[0].Replace, "https:")
	}
}
29	ruleset.yaml
@@ -163,3 +163,32 @@
    user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
    content-security-policy: script-src 'self';
    cookie:
- domain: tagesspiegel.de
  headers:
    content-security-policy: script-src 'self';
    user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
  urlMods:
    query:
      - key: amp
        value: 1
- domain: www.ft.com
  headers:
    referer: https://t.co/x?amp=1
  injections:
    - position: head
      append: |
        <script>
          document.addEventListener("DOMContentLoaded", () => {
            const styleTags = document.querySelectorAll('link[rel="stylesheet"]');
            styleTags.forEach(el => {
              const href = el.getAttribute('href').substring(1);
              const updatedHref = href.replace(/(https?:\/\/.+?)\/{2,}/, '$1/');
              el.setAttribute('href', updatedHref);
            });
            setTimeout(() => {
              const cookie = document.querySelectorAll('.o-cookie-message, .js-article-ribbon, .o-ads, .o-banner, .o-message, .article__content-sign-up');
              cookie.forEach(el => { el.remove(); });
            }, 1000);
          })
        </script>
3	styles/input.css	Normal file
@@ -0,0 +1,3 @@
@tailwind base;
@tailwind components;
@tailwind utilities;
9	tailwind.config.js	Normal file
@@ -0,0 +1,9 @@
/** @type {import('tailwindcss').Config} */
module.exports = {
  content: ["./handlers/**/*.html"],
  theme: {
    extend: {},
  },
  plugins: [],
}