V0.3.3 (#37)
* Change rule and revert

- change the rule length limit of 93 to be exclusive rather than inclusive
- revert to the state prior to the fs loading changes and instead investigate extending the bypass flag for stdin

* Test: Read buffer

Testing a read buffer implementation to increase the reading speed of large files and to see how memory use could be optimized in different scenarios.

* Test: Update buffer to 2 GB

After testing buffer sizes between 1 GB and 5 GB, there doesn't seem to be much difference past an estimated 2 GB.

* Test: Update buffer size

It seems like the unused buffer is freed pretty quickly, so a larger buffer only helps further with large files. This implementation is faster than the original in all cases; what remains is fine-tuning the default buffer size at either 2 GB or 4 GB.

Leaning towards 4 GB because there have been runs almost 30 seconds faster than with the 2 GB buffer, and I would expect users of -f to be on a system with at least 8 GB of RAM.

* Add new regram mode

Added a new mode called regram.
JakeWnuk authored Sep 17, 2024
1 parent 649811a commit 02c0eba
Showing 7 changed files with 190 additions and 37 deletions.
8 changes: 5 additions & 3 deletions README.md
@@ -40,7 +40,7 @@ git clone https://github.com/JakeWnuk/ptt && cd ptt && docker build -t ptt . &&

### Usage:
```
Usage of Password Transformation Tool (ptt) version (0.3.2):
Usage of Password Transformation Tool (ptt) version (0.3.3):
ptt [options] [...]
Accepts standard input and/or additional arguments.
@@ -64,7 +64,7 @@ These modify or filter the transformation mode.
-m int
Minimum numerical frequency to include in output.
-n int
Maximum number of items to return in output.
Maximum number of items to return in output.
-o string
Output to JSON file in addition to stdout.
-p int
@@ -87,7 +87,7 @@ These modify or filter the transformation mode.
-vvv
Show verbose statistics output when possible.
-w int
Number of words to generate for passphrases if applicable.
Number of words to use for a transformation if applicable.
-------------------------------------------------------------------------------------------------------------
Transformation Modes:
These create or alter based on the selected mode.
@@ -114,6 +114,8 @@ These create or alter based on the selected mode.
Transforms input by swapping tokens from a partial mask file and an input file.
-t passphrase -w [words] -tf [file]
Transforms input by randomly generating passphrases with a given number of words and separators from a file.
-t regram -w [words]
Transforms input by 'regramming' sentences into new n-grams with a given number of words.
-t replace-all -tf [file]
Transforms input by replacing all strings with all matches from a ':' separated file.
-t rule-append
75 changes: 55 additions & 20 deletions docs/USAGE.md
@@ -1,5 +1,5 @@
# Password Transformation Tool (PTT) Usage Guide
## Version 0.3.0
## Version 0.3.3

### Table of Contents
#### Getting Started
@@ -37,6 +37,7 @@
2. [Encoding and Decoding](#encoding-and-decoding)
3. [Hex and Dehex](#hex-and-dehex)
4. [Substrings](#substrings)
5. [Regram](#regram)

## Getting Started

@@ -112,25 +113,46 @@ their collective values combined. The rest of the flags can only be used once.
These flags work with files and directories.

#### Options:
- `-b`: Bypass map creation and use stdout as primary output.
- `-d`: Enable debug mode with verbosity levels [0-2].
- `-f`: Read additional files for input.
- `-i`: Starting index for transformations if applicable. Accepts ranges separated by '-'.
- `-k`: Only keep items in a file.
- `-l`: Only output items of a certain length (does not adjust for rules). Accepts ranges separated by '-'.
- `-m`: Minimum numerical frequency to include in output.
- `-n`: Maximum number of items to return in output.
- `-o`: Output to JSON file in addition to stdout.
- `-p`: Change parsing mode for URL input. [0 = Strict, 1 = Permissive, 2 = Maximum].
- `-r`: Only keep items not in a file.
- `-rm`: Replacement mask for transformations if applicable. (default "uldsbt")
- `-t`: Transformation to apply to input.
- `-tf`: Read additional files for transformations if applicable.
- `-tp`: Read a template file for multiple transformations and operations.
- `-u`: Read additional URLs for input.
- `-v`: Show verbose output when possible.
- `-vv`: Show statistics output when possible.
- `-vvv`: Show verbose statistics output when possible.
```
-b Bypass map creation and use stdout as primary output.
-d int
Enable debug mode with verbosity levels [0-2].
-f value
Read additional files for input.
-i value
Starting index for transformations if applicable. Accepts ranges separated by '-'.
-k value
Only keep items in a file.
-l value
Only output items of a certain length (does not adjust for rules). Accepts ranges separated by '-'.
-m int
Minimum numerical frequency to include in output.
-n int
Maximum number of items to return in output.
-o string
Output to JSON file in addition to stdout.
-p int
Change parsing mode for URL input. [0 = Strict, 1 = Permissive, 2 = Maximum] [0-2].
-r value
Only keep items not in a file.
-rm string
Replacement mask for transformations if applicable. (default "uldsbt")
-t string
Transformation to apply to input.
-tf value
Read additional files for transformations if applicable.
-tp value
Read a template file for multiple transformations and operations.
-u value
Read additional URLs for input.
-v Show verbose output when possible.
-vv
Show statistics output when possible.
-vvv
Show verbose statistics output when possible.
-w int
Number of words to use for a transformation if applicable.
```

#### Transformations:
The following transformations can be used with the `-t` flag:
@@ -157,6 +179,8 @@
Transforms input by swapping tokens from a partial mask file and an input file.
-t passphrase -w [words] -tf [file]
Transforms input by randomly generating passphrases with a given number of words and separators from a file.
-t regram -w [words]
Transforms input by 'regramming' sentences into new n-grams with a given number of words.
-t replace-all -tf [file]
Transforms input by replacing all strings with all matches from a ':' separated file.
-t rule-append
@@ -650,3 +674,14 @@ changed to the length of the input.
This transformation can be used to extract specific parts of the input for
further processing.

### Regram
This mode allows 'regramming' sentences into new n-grams with a given number of words. The syntax is as follows:
```
ptt -f <input_file> -t regram -w <word_count>
```
The `regram` transformation generates new n-grams by recombining words from the
input. The number of words per n-gram is specified by the `-w` flag, and the
output is the set of new n-grams generated from the input.
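As a hypothetical illustration (the file name is made up, the exact output formatting depends on the output and verbosity flags in use, and this assumes the underlying n-gram helper slides a window of `-w` words across each line), regramming a file containing the line `the quick brown fox` with `-w 2` would produce word pairs along the lines of:
```
ptt -f sentences.txt -t regram -w 2
the quick
quick brown
brown fox
```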


5 changes: 3 additions & 2 deletions main.go
@@ -15,7 +15,7 @@ import (
"github.com/jakewnuk/ptt/pkg/utils"
)

var version = "0.3.2"
var version = "0.3.3"
var wg sync.WaitGroup
var mutex = &sync.Mutex{}
var retain models.FileArgumentFlag
@@ -65,6 +65,7 @@ func main() {
"passphrase -w [words] -tf [file]": "Transforms input by randomly generating passphrases with a given number of words and separators from a file.",
"substring -i [index]": "Transforms input by extracting substrings starting at index and ending at index.",
"replace-all -tf [file]": "Transforms input by replacing all strings with all matches from a ':' separated file.",
"regram -w [words]": "Transforms input by 'regramming' sentences into new n-grams with a given number of words.",
}

// Sort and print transformation modes
@@ -93,7 +94,7 @@ func main() {
bypassMap := flag.Bool("b", false, "Bypass map creation and use stdout as primary output.")
debugMode := flag.Int("d", 0, "Enable debug mode with verbosity levels [0-2].")
URLParsingMode := flag.Int("p", 0, "Change parsing mode for URL input. [0 = Strict, 1 = Permissive, 2 = Maximum] [0-2].")
passPhraseWords := flag.Int("w", 0, "Number of words to generate for passphrases if applicable.")
passPhraseWords := flag.Int("w", 0, "Number of words to use for a transformation if applicable.")
flag.Var(&retain, "k", "Only keep items in a file.")
flag.Var(&remove, "r", "Only keep items not in a file.")
flag.Var(&readFiles, "f", "Read additional files for input.")
42 changes: 42 additions & 0 deletions pkg/models/models.go
@@ -3,6 +3,7 @@ package models

import (
"fmt"
"io"
"os"
"strings"
)
@@ -97,13 +98,49 @@ func (p PairList) Swap(i, j int) { p[i], p[j] = p[j], p[i] }
// or from a mock file system for testing
type FileSystem interface {
ReadFile(filename string) ([]byte, error)
Open(filename string) (File, error)
}

// File is an interface that represents a file
type File interface {
Read(p []byte) (n int, err error)
Close() error
}

// MockFileSystem is used to read files from the mock file system
type MockFileSystem struct {
Files map[string][]byte
}

// MockFile represents a mock file
type MockFile struct {
Data []byte
Offset int64
}

// Read reads data from the mock file
func (m *MockFile) Read(p []byte) (n int, err error) {
if m.Offset >= int64(len(m.Data)) {
return 0, io.EOF
}
n = copy(p, m.Data[m.Offset:])
m.Offset += int64(n)
return n, nil
}

// Close closes the mock file (no-op for mock)
func (m *MockFile) Close() error {
return nil
}

// Open opens a mock file and returns a File interface
func (m *MockFileSystem) Open(filename string) (File, error) {
if data, ok := m.Files[filename]; ok {
return &MockFile{Data: data}, nil
}
return nil, fmt.Errorf("file not found: %s", filename)
}

// ReadFile Implements the ReadFile method of the FileSystem interface for the MockFileSystem
func (m *MockFileSystem) ReadFile(filename string) ([]byte, error) {
if data, ok := m.Files[filename]; ok {
@@ -120,6 +157,11 @@ func (r *RealFileSystem) ReadFile(filename string) ([]byte, error) {
return os.ReadFile(filename)
}

// Open opens a file and returns a File interface
func (fs RealFileSystem) Open(filename string) (File, error) {
return os.Open(filename)
}

// Scanner is an interface that is used to read lines from a file
type Scanner interface {
Scan() bool
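For illustration only, here is a minimal sketch of how the new Open/File plumbing might be exercised in a unit test. The test, its file name, and its contents are hypothetical and not part of this commit; the import path follows the module path used in main.go.

```go
package models_test

import (
	"io"
	"testing"

	"github.com/jakewnuk/ptt/pkg/models"
)

// Hypothetical test: read a mock file back through the File interface.
func TestMockFileSystemOpen(t *testing.T) {
	fs := &models.MockFileSystem{Files: map[string][]byte{
		"words.txt": []byte("alpha\nbeta\n"),
	}}

	f, err := fs.Open("words.txt")
	if err != nil {
		t.Fatalf("unexpected error opening mock file: %v", err)
	}
	defer f.Close()

	// models.File satisfies io.Reader, so io.ReadAll drains it until io.EOF.
	data, err := io.ReadAll(f)
	if err != nil {
		t.Fatalf("unexpected error reading mock file: %v", err)
	}
	if string(data) != "alpha\nbeta\n" {
		t.Errorf("unexpected contents: %q", string(data))
	}
}
```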
4 changes: 2 additions & 2 deletions pkg/rule/rule.go
@@ -122,7 +122,7 @@ func FormatCharToRuleOutput(strs ...string) (output string) {
output = output[:len(output)-1] + ":"
}

if output != "" && len(output) <= 93 {
if output != "" && len(output) < 93 {
return strings.TrimSpace(output)
}

@@ -157,7 +157,7 @@ func FormatCharToIteratingRuleOutput(index int, strs ...string) (output string)
}
}

if output != "" && len(output) <= 93 {
if output != "" && len(output) < 93 {
return strings.TrimSpace(output)
}

52 changes: 52 additions & 0 deletions pkg/transform/transform.go
@@ -5,6 +5,7 @@ import (
"fmt"
"math/rand"
"os"
"strings"

"github.com/jakewnuk/ptt/pkg/format"
"github.com/jakewnuk/ptt/pkg/mask"
@@ -129,6 +130,12 @@ func TransformationController(input map[string]int, mode string, startingIndex i
os.Exit(1)
}
output = ReplaceAllKeysInMap(input, transformationFilesMap, bypass, functionDebug)
case "regram":
if passphraseWords == 0 {
fmt.Fprintf(os.Stderr, "[!] Regram operations require use of the -w flag to specify the number of words to use\n")
os.Exit(1)
}
output = GenerateNGramMap(input, passphraseWords, bypass, functionDebug)
default:
output = input
}
@@ -308,3 +315,48 @@ func GeneratePassphrase(passWords map[string]int, transformationFilesMap map[str

return newKeyPhrase
}

// GenerateNGramMap takes a map of keys and values and generates a new map
// using the utils.GenerateNGrams function and combines the results. This
// function is used to generate n-grams from the input map for the regram
// transformation mode.
//
// Args:
//
// input (map[string]int): The original map to generate n-grams from
// ngramSize (int): The size of the n-grams to generate
// bypass (bool): If true, the map is not used for output or filtering
// debug (bool): If true, print additional debug information to stderr
//
// Returns:
//
// (map[string]int): A new map with the n-grams generated
func GenerateNGramMap(input map[string]int, ngramSize int, bypass bool, debug bool) map[string]int {
newMap := make(map[string]int)
for key, value := range input {
newKeyArray := utils.GenerateNGrams(key, ngramSize)
for _, newKey := range newKeyArray {

if debug {
fmt.Fprintf(os.Stderr, "Key: %s\n", key)
fmt.Fprintf(os.Stderr, "New Key: %s\n", newKey)
}

newKey = strings.TrimSpace(newKey)
newKey = strings.TrimLeft(newKey, ",")
newKey = strings.TrimRight(newKey, ",")
newKey = strings.TrimLeft(newKey, " ")

if !bypass {
if newMap[newKey] == 0 {
newMap[newKey] = value
} else {
newMap[newKey] += value
}
} else {
fmt.Println(newKey)
}
}
}
return newMap
}
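GenerateNGramMap depends on utils.GenerateNGrams, which is not touched by this diff. The sketch below shows one way such a helper could behave, assuming sliding-window word n-grams of the requested size; it is an illustration, not the committed implementation.

```go
package utils

import "strings"

// GenerateNGrams (sketch): split the input into whitespace-separated words
// and emit every contiguous run of n words, joined by single spaces.
func GenerateNGrams(s string, n int) []string {
	words := strings.Fields(s)
	if n <= 0 || len(words) < n {
		return nil
	}
	ngrams := make([]string, 0, len(words)-n+1)
	for i := 0; i+n <= len(words); i++ {
		ngrams = append(ngrams, strings.Join(words[i:i+n], " "))
	}
	return ngrams
}
```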
41 changes: 31 additions & 10 deletions pkg/utils/utils.go
@@ -40,6 +40,8 @@ import (
// (map[string]int): A map of words from the files
func ReadFilesToMap(fs models.FileSystem, filenames []string) map[string]int {
wordMap := make(map[string]int)
// 4 GB read buffer
chunkSize := int64(4 * 1024 * 1024 * 1024)

i := 0
for i < len(filenames) {
@@ -52,21 +54,40 @@ func ReadFilesToMap(fs models.FileSystem, filenames []string) map[string]int {
}
filenames = append(filenames, files...)
} else {
data, err := fs.ReadFile(filename)
file, err := fs.Open(filename)
if err != nil {
fmt.Fprintf(os.Stderr, "[!] Error reading file %s\n", filename)
fmt.Fprintf(os.Stderr, "[!] Error opening file %s\n", filename)
os.Exit(1)
}
defer file.Close()

buffer := make([]byte, chunkSize)
for {
bytesRead, err := file.Read(buffer)
if err != nil && err != io.EOF {
fmt.Fprintf(os.Stderr, "[!] Error reading file %s\n", filename)
os.Exit(1)
}
if bytesRead == 0 {
break
}

err = json.Unmarshal(data, &wordMap)
if err == nil {
fmt.Fprintf(os.Stderr, "[*] Detected ptt JSON output. Importing...\n")
continue
}
data := buffer[:bytesRead]

fileWords := strings.Split(string(data), "\n")
for _, word := range fileWords {
wordMap[word]++
err = json.Unmarshal(data, &wordMap)
if err == nil {
fmt.Fprintf(os.Stderr, "[*] Detected ptt JSON output. Importing...\n")
continue
}

fileWords := strings.Split(string(data), "\n")
for _, word := range fileWords {
wordMap[word]++
}

if err == io.EOF {
break
}
}
}
i++
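For context, a minimal usage sketch of the buffered reader above. The file name is hypothetical and the import paths follow the ones shown earlier in this commit; error handling inside ReadFilesToMap (exiting on failure) is unchanged.

```go
package main

import (
	"fmt"

	"github.com/jakewnuk/ptt/pkg/models"
	"github.com/jakewnuk/ptt/pkg/utils"
)

func main() {
	// RealFileSystem implements the extended FileSystem interface (ReadFile + Open).
	fs := &models.RealFileSystem{}

	// Hypothetical wordlist; ReadFilesToMap counts newline-separated items per file.
	wordMap := utils.ReadFilesToMap(fs, []string{"wordlist.txt"})

	for word, count := range wordMap {
		fmt.Printf("%d %s\n", count, word)
	}
}
```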
