Utility to extract the contents of a subtitle file.
Supported formats:
srt: SubRip. Other types will be supported as soon as possible :)
Installation is done using go get:
go get -u github.com/arman-aminian/gosub
The Parse function requires the following parameter:
path: location of the subtitle file.
s, err := gosub.Parse("./Example/test.srt")
if err != nil {
	panic(err)
}
To parse a file that is already open, pass it to ParseByFile instead:
file, err := os.Open("./Example/test.srt")
if err != nil {
	panic(err)
}
defer file.Close()
s, err := gosub.ParseByFile(file)
if err != nil {
	panic(err)
}
Each line of a subtitle is represented by a Line struct with the following properties:
Start: timestamp of the start of the dialog.
End: timestamp of the end of the dialog.
Text: dialog contents.
for _, line := range s.Lines {
	fmt.Println(line.Start, " -> ", line.End)
	fmt.Println(line.Text)
	fmt.Println()
}
Output:
0000-01-01 00:00:00 +0000 UTC -> 0000-01-01 00:00:30.05 +0000 UTC
first line [test test] of subtitle
0000-01-01 00:01:21.062 +0000 UTC -> 0000-01-01 00:01:26.361 +0000 UTC
second line <i> test test<\i> of subtitle
0000-01-01 00:01:26.565 +0000 UTC -> 0000-01-01 00:01:32.564 +0000 UTC
third line of subtitle
third 2
0000-01-01 00:01:32.768 +0000 UTC -> 0000-01-01 00:01:37.967 +0000 UTC
Make IT lowerCase
Currently, 3 cleaners are provided:
RemoveBrackets will remove brackets and anything between them (e.g., [test test]).
RemoveTags will remove formatting keys like <i> and </i>, and also anything between them.
LowerCase will lower case all text.
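To make the effect of the three cleaners concrete, here is a minimal sketch of equivalent transformations on plain strings. The lowercase function names are hypothetical and do not show how gosub's cleaners are called. In this sketch `removeTags` strips only the tag markup (everything from `<` to `>`) and keeps the enclosed text. Whether gosub also drops the text between an opening and a closing tag is not settled here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// bracketRe matches a "[" up to the next "]", including both brackets.
	bracketRe = regexp.MustCompile(`\[[^\]]*\]`)
	// tagRe matches a "<" up to the next ">", e.g. "<i>" or "</i>".
	tagRe = regexp.MustCompile(`<[^>]*>`)
)

// removeBrackets deletes bracketed spans such as "[test test]".
func removeBrackets(s string) string { return bracketRe.ReplaceAllString(s, "") }

// removeTags deletes tag markup but keeps the text the tags enclose.
func removeTags(s string) string { return tagRe.ReplaceAllString(s, "") }

// lowerCase converts the text to lower case.
func lowerCase(s string) string { return strings.ToLower(s) }

func main() {
	fmt.Printf("%q\n", removeBrackets("first line [test test] of subtitle"))
	fmt.Printf("%q\n", removeTags("second line <i> test test</i> of subtitle"))
	fmt.Printf("%q\n", lowerCase("Make IT lowerCase"))
}
```

Removing a span leaves the spaces that surrounded it, so a real cleaner may also want to collapse repeated whitespace.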
Calculate WPM (words per minute):
CalculateMaxWpm will find the max WPM.
CalculateMinWpm will find the min WPM.
CalculateMeanWpm will find the mean WPM.
TotalWordCount will calculate the number of words in the file.