Fortran Tokenizer

This package provides a basic tokenizer and lets users easily customize its behavior.

Use in your project with fpm

Include this in your fpm.toml under the dependencies section:

[ dependencies ]
fortran-tokenizer = { git = "https://github.com/14NGiestas/fortran-tokenizer.git" }

Basic Usage

By default, a valid token does not contain spaces

!file: test/default.f90
program main
    use fortran_tokenizer
    implicit none
    type(tokenizer_t) :: tokenizer
    type(token_list) :: tokens

    ! Perform the tokenization
    tokens = tokenizer % tokenize("hello world")
    print*, tokens ! ['hello', 'world']
end program

Tokenize by custom function

You can override the default behavior by defining a function that receives a character(*), intent(in) :: token and returns .true. if the string it receives is a valid token and .false. otherwise. Pass that function to tokenize through the validate argument.

!file: test/custom_function.f90
program main
    use fortran_tokenizer
    implicit none

    character, parameter :: TABS = char(9)
    character(*), parameter :: IDENT = repeat(TABS, 4) ! assumed length, so all four tabs are kept
    character(*), parameter :: string = TABS//"hello "//IDENT//"world"

    type(tokenizer_t) :: tokenizer
    type(token_list) :: tokens

    ! Perform the tokenization
    tokens = tokenizer % tokenize(string, validate=by_whitespace)

    ! Show the resultant tokens
    print "('string: ""',A,'""')", string
    print*, tokens ! ['hello', 'world']

contains

    logical function by_whitespace(token) result(is_valid)
        !! The token is valid if it does not contain " " or TABS
        character(*), intent(in) :: token
        is_valid = .not. scan(token, " "//TABS) > 0
    end function

end program
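
To see why a predicate such as by_whitespace is enough to split a string, here is a minimal standalone sketch that does not use the package. It grows a candidate substring one character at a time and cuts a token as soon as the candidate stops being valid. The names split_sketch, validator_i, and print_tokens are hypothetical and chosen only for illustration; the package's actual implementation may work differently.

```fortran
module split_sketch
    implicit none

    ! Shape of the callback: same arguments as the README's predicates
    abstract interface
        logical function validator_i(token)
            character(*), intent(in) :: token
        end function
    end interface

contains

    subroutine print_tokens(text, validate)
        !! Print every non-empty token found in text (illustrative only)
        character(*), intent(in) :: text
        procedure(validator_i) :: validate
        integer :: first, last

        first = 1
        do last = 1, len(text)
            ! The candidate text(first:last) turns invalid once it swallows
            ! a separator, so the token ends just before position last.
            if (.not. validate(text(first:last))) then
                if (last > first) print "('[',A,']')", text(first:last-1)
                first = last + 1   ! restart after the separator
            end if
        end do

        ! Flush whatever remains after the final separator
        if (first <= len(text)) print "('[',A,']')", text(first:)
    end subroutine

end module

program main
    use split_sketch
    implicit none

    call print_tokens("hello| |world", no_pipe)

contains

    logical function no_pipe(token) result(is_valid)
        !! The token is valid if it does not contain "|"
        character(*), intent(in) :: token
        is_valid = scan(token, "|") == 0
    end function

end program
```

With this input the sketch prints [hello], [ ], and [world] on separate lines: the single space survives as a token because no_pipe only rejects the pipe character.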

You can also pass a second function with the same interface through the ignore argument. Once a token has been parsed, it is dropped from the result if this function returns .true. for it.

!file: test/custom_ignore.f90
program main
    use fortran_tokenizer
    implicit none

    character(*), parameter :: string = "hello | | duck | world duck"

    type(tokenizer_t) :: tokenizer
    type(token_list) :: tokens

    ! Perform the tokenization
    tokens = tokenizer % tokenize(string, validate=by_pipe, ignore=if_is_duck)

    ! Show the resultant tokens
    print "('string: ""',A,'""')", string
    print*, tokens ! ['hello', 'world']

contains

    logical function by_pipe(token) result(is_valid)
        !! Return .false. if the token contains a pipe ("|") or a space
        character(*), intent(in) :: token
        is_valid = .not. scan(token, "| ") > 0
    end function

    logical function if_is_duck(token) result(is_valid)
        !! Return .true. if token is a duck
        character(*), intent(in) :: token
        is_valid = token == "duck"
    end function

end program
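
The ignore step can be pictured as a filter applied to tokens that have already been parsed. The sketch below is standalone and does not call the package; the array parsed and the function is_duck are illustrative stand-ins, and whether the package filters at exactly this point is an assumption.

```fortran
program ignore_sketch
    implicit none

    ! Tokens as they might look after splitting (illustrative data)
    character(5), parameter :: parsed(4) = &
        [character(5) :: "hello", "duck", "world", "duck"]
    integer :: i

    do i = 1, size(parsed)
        ! Drop the token when the ignore predicate returns .true.
        if (is_duck(trim(parsed(i)))) cycle
        print "(A)", trim(parsed(i))
    end do

contains

    logical function is_duck(token) result(ignored)
        !! Return .true. if the token should be skipped
        character(*), intent(in) :: token
        ignored = token == "duck"
    end function

end program
```

Only hello and world are printed, one per line, because both duck entries are skipped.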

Bind a global rule

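To reuse the same rules across several calls, you can bind functions to the tokenizer's validator and ignore components. A validate argument passed directly to tokenize takes precedence over the bound validator for that call.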

!file: test/bind_global_function.f90
program main
    use fortran_tokenizer
    implicit none
    type(tokenizer_t) :: tokenizer
    type(token_list) :: tokens

    tokenizer % validator => global_behavior

    ! Perform the tokenization
    tokens = tokenizer % tokenize("hello| world")
    print*, tokens ! ['hello', 'world']

    ! Perform another tokenization
    tokens = tokenizer % tokenize("hello |world")
    print*, tokens ! ['hello', 'world']

    ! Perform the tokenization, using a local behaviour
    tokens = tokenizer % tokenize("hello| |world|token with spaces|", validate=local_behavior)

    ! Show the resultant tokens
    print*, tokens ! ['hello', ' ', 'world', 'token with spaces']

    ! Bind global ignore behavior
    tokenizer % ignore => ignore_tokens

    ! Perform the tokenization using a local behaviour; any token containing a space will be ignored
    tokens = tokenizer % tokenize("hello| |world|tokens with spaces|", validate=local_behavior)

    ! Show the resultant tokens
    print*, tokens ! ['hello', 'world']

contains

    logical function global_behavior(token) result(is_valid)
        !! The token is valid if it does not contain " " or "|"
        character(*), intent(in) :: token
        is_valid = .not. scan(token, "| ") > 0
    end function

    logical function local_behavior(token) result(is_valid)
        !! The token is valid if it does not contain "|"
        character(*), intent(in) :: token
        is_valid = .not. scan(token, "|") > 0
    end function

    logical function ignore_tokens(token) result(is_valid)
        !! The token will be ignored if it contains any spaces
        character(*), intent(in) :: token
        is_valid = scan(token, " ") > 0
    end function

end program
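
One way to picture a bound rule that a per-call argument can override is a derived type with a procedure pointer component plus an optional dummy procedure. The sketch below is standalone; splitter_t, predicate_i, and accepts are hypothetical names, not the package's definitions, and the resolution order (per-call rule, then bound rule, then the no-spaces default) is an assumption inferred from the results shown in the example above.

```fortran
module binding_sketch
    implicit none

    abstract interface
        logical function predicate_i(token)
            character(*), intent(in) :: token
        end function
    end interface

    ! Hypothetical stand-in for a tokenizer type
    type :: splitter_t
        procedure(predicate_i), pointer, nopass :: validator => null()
    contains
        procedure :: accepts
    end type

contains

    logical function accepts(self, token, validate) result(ok)
        !! Use the per-call rule if given, else the bound rule, else the default
        class(splitter_t), intent(in) :: self
        character(*), intent(in) :: token
        procedure(predicate_i), optional :: validate

        if (present(validate)) then
            ok = validate(token)
        else if (associated(self % validator)) then
            ok = self % validator(token)
        else
            ok = scan(token, " ") == 0   ! default: no spaces allowed
        end if
    end function

    logical function no_pipe_or_space(token) result(is_valid)
        character(*), intent(in) :: token
        is_valid = scan(token, "| ") == 0
    end function

    logical function no_pipe(token) result(is_valid)
        character(*), intent(in) :: token
        is_valid = scan(token, "|") == 0
    end function

end module

program main
    use binding_sketch
    implicit none
    type(splitter_t) :: splitter

    print *, splitter % accepts("a|b")                    ! default rule: T
    splitter % validator => no_pipe_or_space
    print *, splitter % accepts("a|b")                    ! bound rule: F
    print *, splitter % accepts("a b", validate=no_pipe)  ! per-call rule: T
end program
```

The last call shows the override: the bound rule would reject "a b" because of the space, but the per-call rule only rejects pipes.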

Contributing

Getting the source

First get the code, by cloning the repo:

git clone https://github.com/14NGiestas/fortran-tokenizer.git
cd fortran-tokenizer

Building with FPM

This project was designed to be built using the Fortran Package Manager (FPM). Follow the FPM installation instructions if you haven't installed it already.

To build and run the program (provided as an example), type:

fpm run

To run the tests, type:

fpm test
