11 Commits

Author SHA1 Message Date
Gianni Carafa
8d1554e10e fix buildjob 2023-11-10 17:31:00 +01:00
Gianni Carafa
ff5bb61891 Merge branch 'main' of github.com:kubero-dev/paywall-ladder 2023-11-10 15:57:48 +01:00
Gianni Carafa
936b418b00 move to new organisation 2023-11-10 15:57:29 +01:00
Gianni C
ac44f12d85 Merge pull request #21 from joncrangle/csp-override
Allow the user to specify the Content Security Policy for a domain
2023-11-10 09:04:54 +01:00
joncrangle
b6f0c644f8 Undo prettier 2023-11-09 22:05:44 -05:00
joncrangle
66c4b3c911 Undo prettier 2023-11-09 22:03:37 -05:00
joncrangle
924696c015 Enable user to define their own content-security-policy 2023-11-09 21:50:46 -05:00
Gianni Carafa
81aa00c2ea include all subdomains 2023-11-09 23:51:36 +01:00
Gianni Carafa
6c1f58e2e7 Add headers field in ruleset. Enable Google Cache. 2023-11-09 23:32:43 +01:00
Gianni Carafa
d3c995df34 unblur main image 2023-11-09 12:09:09 +01:00
Gianni Carafa
6f4a2daeca fix LOG_URLS feature 2023-11-09 09:37:46 +01:00
12 changed files with 169 additions and 83 deletions

View File

@@ -37,4 +37,4 @@ jobs:
args: release --clean
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GORELEASER_GITHUB_TOKEN: ${{ secrets.GORELEASER_GITHUB_TOKEN }}
# GORELEASER_GITHUB_TOKEN: ${{ secrets.GORELEASER_GITHUB_TOKEN }}

View File

@@ -33,10 +33,10 @@ changelog:
#brews:
# -
# repository:
# owner: kubero-dev
# owner: everywall
# name: homebrew-ladder
# token: "{{ .Env.GORELEASER_GITHUB_TOKEN }}"
# homepage: "https://www.kubero.dev"
# description: "Manage your kubero applications with the CLI"
# homepage: "https://www.everyladder.dev"
# description: "Manage your everyladder applications modify every website"
# test: |
# system "#{bin}/kubero", "--version"
# system "#{bin}/everyladder", "--version"

View File

@@ -3,7 +3,7 @@
</p>
<h1 align="center">Ladder</h1>
<div><img alt="License" src="https://img.shields.io/github/license/kubero-dev/ladder"> <img alt="go.mod Go version " src="https://img.shields.io/github/go-mod/go-version/kubero-dev/ladder"> <img alt="GitHub tag (with filter)" src="https://img.shields.io/github/v/tag/kubero-dev/ladder"> <img alt="GitHub (Pre-)Release Date" src="https://img.shields.io/github/release-date-pre/kubero-dev/ladder"> <img alt="GitHub Downloads all releases" src="https://img.shields.io/github/downloads/kubero-dev/ladder/total"> <img alt="GitHub Build Status (with event)" src="https://img.shields.io/github/actions/workflow/status/kubero-dev/ladder/release-binaries.yaml"></div>
<div><img alt="License" src="https://img.shields.io/github/license/everywall/ladder"> <img alt="go.mod Go version " src="https://img.shields.io/github/go-mod/go-version/everywall/ladder"> <img alt="GitHub tag (with filter)" src="https://img.shields.io/github/v/tag/everywall/ladder"> <img alt="GitHub (Pre-)Release Date" src="https://img.shields.io/github/release-date-pre/everywall/ladder"> <img alt="GitHub Downloads all releases" src="https://img.shields.io/github/downloads/everywall/ladder/total"> <img alt="GitHub Build Status (with event)" src="https://img.shields.io/github/actions/workflow/status/everywall/ladder/release-binaries.yaml"></div>
*Ladder is a web proxy to help bypass paywalls.* This is a selfhosted version of [1ft.io](https://1ft.io) and [12ft.io](https://12ft.io). It is inspired by [13ft](https://github.com/wasi-master/13ft).
@@ -23,7 +23,7 @@ Freedom of information is an essential pillar of democracy and informed decision
- [x] Fetch RAW HTML
- [x] Custom User Agent
- [x] Custom X-Forwarded-For IP
- [x] [Docker container](https://github.com/kubero-dev/ladder/pkgs/container/ladder) (amd64, arm64)
- [x] [Docker container](https://github.com/everywall/ladder/pkgs/container/ladder) (amd64, arm64)
- [x] Linux binary
- [x] Mac OS binary
- [x] Windows binary (untested)
@@ -47,18 +47,18 @@ Some sites do not expose their content to search engines, which means that the p
> **Warning:** If your instance will be publicly accessible, make sure to enable Basic Auth. This will prevent unauthorized users from using your proxy. If you do not enable Basic Auth, anyone can use your proxy to browse nasty/illegal stuff. And you will be responsible for it.
### Binary
1) Download binary [here](https://github.com/kubero-dev/ladder/releases/latest)
1) Download binary [here](https://github.com/everywall/ladder/releases/latest)
2) Unpack and run the binary `./ladder`
3) Open Browser (Default: http://localhost:8080)
### Docker
```bash
docker run -p 8080:8080 -d --name ladder ghcr.io/kubero-dev/ladder:latest
docker run -p 8080:8080 -d --name ladder ghcr.io/everywall/ladder:latest
```
### Docker Compose
```bash
curl https://raw.githubusercontent.com/kubero-dev/ladder/main/docker-compose.yaml --output docker-compose.yaml
curl https://raw.githubusercontent.com/everywall/ladder/main/docker-compose.yaml --output docker-compose.yaml
docker-compose up -d
```
@@ -106,7 +106,7 @@ http://localhost:8080/ruleset
| `LOG_URLS` | Log fetched URL's | `true` |
| `DISABLE_FORM` | Disables URL Form Frontpage | `false` |
| `FORM_PATH` | Path to custom Form HTML | `` |
| `RULESET` | URL to a ruleset file | `https://raw.githubusercontent.com/kubero-dev/ladder/main/ruleset.yaml` or `/path/to/my/rules.yaml` |
| `RULESET` | URL to a ruleset file | `https://raw.githubusercontent.com/everywall/ladder/main/ruleset.yaml` or `/path/to/my/rules.yaml` |
| `EXPOSE_RULESET` | Make your Ruleset available to other ladders | `true` |
| `ALLOWED_DOMAINS` | Comma separated list of allowed domains. Empty = no limitations | `` |
| `ALLOWED_DOMAINS_RULESET` | Allow Domains from Ruleset. false = no limitations | `false` |
@@ -120,10 +120,16 @@ It is possible to apply custom rules to modify the response. This can be used to
See in [ruleset.yaml](ruleset.yaml) for an example.
```yaml
- domain: www.example.com
- domain: example.com # Inbcludes all subdomains
domains: # Additional domains to apply the rule
- www.example.com
- www.example.de
- www.beispiel.de
headers:
x-forwarded-for: none # override X-Forwarded-For header or delete with none
referer: none # override Referer header or delete with none
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
content-security-policy: script-src 'self'; # override response header
cookie: privacy=1
regexRules:
- match: <script\s+([^>]*\s+)?src="(/)([^"]*)"
replace: <script $1 script="/https://www.example.com/$3"
@@ -138,7 +144,7 @@ See in [ruleset.yaml](ruleset.yaml) for an example.
- domain: www.anotherdomain.com # Domain where the rule applies
paths: # Paths where the rule applies
- /article
googleCache: false # Search also in Google Cache
googleCache: false # Use Google Cache to fetch the content
regexRules: # Regex rules to apply
- match: <script\s+([^>]*\s+)?src="(/)([^"]*)"
replace: <script $1 script="/https://www.example.com/$3"

View File

@@ -47,6 +47,7 @@ func main() {
app := fiber.New(
fiber.Config{
Prefork: *prefork,
GETOnly: true,
},
)

View File

@@ -1,7 +1,7 @@
version: '3'
services:
ladder:
image: ghcr.io/kubero-dev/ladder:latest
image: ghcr.io/everywall/ladder:latest
container_name: ladder
#build: .
#restart: always

View File

@@ -36,7 +36,7 @@
</style>
</head>
<body>
<a href="https://github.com/kubero-dev/ladder">
<a href="https://github.com/everywall/ladder">
<div class="github-corner" aria-label="View source on GitHub">
<svg
xmlns:svg="http://www.w3.org/2000/svg"

View File

@@ -33,6 +33,8 @@ func ProxySite(c *fiber.Ctx) error {
}
c.Set("Content-Type", resp.Header.Get("Content-Type"))
c.Set("Content-Security-Policy", resp.Header.Get("Content-Security-Policy"))
return c.SendString(body)
}
@@ -56,17 +58,49 @@ func fetchSite(urlpath string, queries map[string]string) (string, *http.Request
return "", nil, nil, fmt.Errorf("domain not allowed. %s not in %s", u.Host, allowedDomains)
}
if os.Getenv("LOG_URLS ") == "true" {
if os.Getenv("LOG_URLS") == "true" {
log.Println(u.String() + urlQuery)
}
rule := fetchRule(u.Host, u.Path)
if rule.GoogleCache {
u, err = url.Parse("https://webcache.googleusercontent.com/search?q=cache:" + u.String())
if err != nil {
return "", nil, nil, err
}
}
// Fetch the site
client := &http.Client{}
req, _ := http.NewRequest("GET", u.String()+urlQuery, nil)
req.Header.Set("User-Agent", UserAgent)
req.Header.Set("X-Forwarded-For", ForwardedFor)
req.Header.Set("Referer", u.String())
req.Header.Set("Host", u.Host)
if rule.Headers.UserAgent != "" {
req.Header.Set("User-Agent", rule.Headers.UserAgent)
} else {
req.Header.Set("User-Agent", UserAgent)
}
if rule.Headers.XForwardedFor != "" {
if rule.Headers.XForwardedFor != "none" {
req.Header.Set("X-Forwarded-For", rule.Headers.XForwardedFor)
}
} else {
req.Header.Set("X-Forwarded-For", ForwardedFor)
}
if rule.Headers.Referer != "" {
if rule.Headers.Referer != "none" {
req.Header.Set("Referer", rule.Headers.Referer)
}
} else {
req.Header.Set("Referer", u.String())
}
if rule.Headers.Cookie != "" {
req.Header.Set("Cookie", rule.Headers.Cookie)
}
resp, err := client.Do(req)
if err != nil {
@@ -79,11 +113,16 @@ func fetchSite(urlpath string, queries map[string]string) (string, *http.Request
return "", nil, nil, err
}
body := rewriteHtml(bodyB, u)
if rule.Headers.CSP != "" {
resp.Header.Set("Content-Security-Policy", rule.Headers.CSP)
}
log.Print("rule", rule)
body := rewriteHtml(bodyB, u, rule)
return body, req, resp, nil
}
func rewriteHtml(bodyB []byte, u *url.URL) string {
func rewriteHtml(bodyB []byte, u *url.URL, rule Rule) string {
// Rewrite the HTML
body := string(bodyB)
@@ -104,7 +143,7 @@ func rewriteHtml(bodyB []byte, u *url.URL) string {
body = strings.ReplaceAll(body, "href=\"https://"+u.Host, "href=\"/https://"+u.Host+"/")
if os.Getenv("RULESET") != "" {
body = applyRules(u.Host, u.Path, body)
body = applyRules(body, rule)
}
return body
}
@@ -169,69 +208,61 @@ func loadRules() RuleSet {
return ruleSet
}
func applyRules(domain string, path string, body string) string {
func fetchRule(domain string, path string) Rule {
if len(rulesSet) == 0 {
return Rule{}
}
rule := Rule{}
for _, rule := range rulesSet {
domains := rule.Domains
if rule.Domain != "" {
domains = append(domains, rule.Domain)
}
for _, ruleDomain := range domains {
if ruleDomain == domain || strings.HasSuffix(domain, ruleDomain) {
if len(rule.Paths) > 0 && !StringInSlice(path, rule.Paths) {
continue
}
// return first match
return rule
}
}
}
return rule
}
func applyRules(body string, rule Rule) string {
if len(rulesSet) == 0 {
return body
}
for _, rule := range rulesSet {
domains := rule.Domains
domains = append(domains, rule.Domain)
for _, ruleDomain := range domains {
if ruleDomain != domain {
continue
}
if len(rule.Paths) > 0 && !StringInSlice(path, rule.Paths) {
continue
}
for _, regexRule := range rule.RegexRules {
re := regexp.MustCompile(regexRule.Match)
body = re.ReplaceAllString(body, regexRule.Replace)
}
for _, injection := range rule.Injections {
doc, err := goquery.NewDocumentFromReader(strings.NewReader(body))
if err != nil {
log.Fatal(err)
}
if injection.Replace != "" {
doc.Find(injection.Position).ReplaceWithHtml(injection.Replace)
}
if injection.Append != "" {
doc.Find(injection.Position).AppendHtml(injection.Append)
}
if injection.Prepend != "" {
doc.Find(injection.Position).PrependHtml(injection.Prepend)
}
body, err = doc.Html()
if err != nil {
log.Fatal(err)
}
}
for _, regexRule := range rule.RegexRules {
re := regexp.MustCompile(regexRule.Match)
body = re.ReplaceAllString(body, regexRule.Replace)
}
for _, injection := range rule.Injections {
doc, err := goquery.NewDocumentFromReader(strings.NewReader(body))
if err != nil {
log.Fatal(err)
}
if injection.Replace != "" {
doc.Find(injection.Position).ReplaceWithHtml(injection.Replace)
}
if injection.Append != "" {
doc.Find(injection.Position).AppendHtml(injection.Append)
}
if injection.Prepend != "" {
doc.Find(injection.Position).PrependHtml(injection.Prepend)
}
body, err = doc.Html()
if err != nil {
log.Fatal(err)
}
}
return body
}
type Rule struct {
Match string `yaml:"match"`
Replace string `yaml:"replace"`
}
type RuleSet []struct {
Domain string `yaml:"domain"`
Domains []string `yaml:"domains,omitempty"`
Paths []string `yaml:"paths,omitempty"`
GoogleCache bool `yaml:"googleCache,omitempty"`
RegexRules []Rule `yaml:"regexRules"`
Injections []struct {
Position string `yaml:"position"`
Append string `yaml:"append"`
Prepend string `yaml:"prepend"`
Replace string `yaml:"replace"`
} `yaml:"injections"`
}
func StringInSlice(s string, list []string) bool {
for _, x := range list {
if strings.HasPrefix(s, x) {

View File

@@ -51,7 +51,7 @@ func TestRewriteHtml(t *testing.T) {
</html>
`
actual := rewriteHtml(bodyB, u)
actual := rewriteHtml(bodyB, u, Rule{})
assert.Equal(t, expected, actual)
}

29
handlers/types.go Normal file
View File

@@ -0,0 +1,29 @@
package handlers
type Regex struct {
Match string `yaml:"match"`
Replace string `yaml:"replace"`
}
type RuleSet []Rule
type Rule struct {
Domain string `yaml:"domain,omitempty"`
Domains []string `yaml:"domains,omitempty"`
Paths []string `yaml:"paths,omitempty"`
Headers struct {
UserAgent string `yaml:"user-agent,omitempty"`
XForwardedFor string `yaml:"x-forwarded-for,omitempty"`
Referer string `yaml:"referer,omitempty"`
Cookie string `yaml:"cookie,omitempty"`
CSP string `yaml:"content-security-policy,omitempty"`
} `yaml:"headers,omitempty"`
GoogleCache bool `yaml:"googleCache,omitempty"`
RegexRules []Regex `yaml:"regexRules"`
Injections []struct {
Position string `yaml:"position"`
Append string `yaml:"append"`
Prepend string `yaml:"prepend"`
Replace string `yaml:"replace"`
} `yaml:"injections"`
}

View File

@@ -1,6 +1,6 @@
apiVersion: v2
name: ladder
description: A helm chart to deploy kubero-dev/ladder
description: A helm chart to deploy everywall/ladder
type: application
version: "1.0"
appVersion: "v0.0.11"

View File

@@ -1,5 +1,5 @@
image:
RELEASE: ghcr.io/kubero-dev/ladder:v0.0.11
RELEASE: ghcr.io/everywall/ladder:latest
env:
PORT: 8080
@@ -10,7 +10,7 @@ env:
LOG_URLS: "true"
DISABLE_FORM: "false"
FORM_PATH: ""
RULESET: "https://raw.githubusercontent.com/kubero-dev/ladder/main/ruleset.yaml"
RULESET: "https://raw.githubusercontent.com/everywall/ladder/main/ruleset.yaml"
EXPOSE_RULESET: "true"
ALLOWED_DOMAINS: ""
ALLOWED_DOMAINS_RULESET: "false"

View File

@@ -1,6 +1,12 @@
- domain: www.example.com
- domain: example.com
domains:
- www.beispiel.com
- www.beispiel.de
googleCache: true
headers:
x-forwarded-for: none
referer: none
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
cookie: privacy=1
regexRules:
- match: <script\s+([^>]*\s+)?src="(/)([^"]*)"
replace: <script $1 script="/https://www.example.com/$3"
@@ -77,6 +83,10 @@
- domains:
- www.nytimes.com
- www.time.com
headers:
ueser-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
cookie: nyt-a=; nyt-gdpr=0; nyt-geo=DE; nyt-privacy=1
referer: https://www.google.com/
injections:
- position: head
append: |
@@ -142,5 +152,14 @@
paywall.forEach(el => { el.remove(); });
const images = document.querySelectorAll('img');
images.forEach(image => { image.parentElement.style.filter = ''; });
const headimage = document.querySelectorAll('div .aspect-custom');
headimage.forEach(image => { image.style.filter = ''; });
});
</script>
- domain: medium.com
headers:
referer: https://t.co/x?amp=1
x-forwarded-for: none
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
content-security-policy: script-src 'self';
cookie: