DocumentAI Expense Bot

A DocumentAI processor service that parses documents and outputs them as JSON.

Prerequisites

  • You should have a gcloud user credentials file, generated after authenticating with gcloud auth login; it should be located at ~/.config/gcloud/application_default_credentials.json

  • You should have either MongoDB installed or Docker if you want to use the Mongo driver to store documents; otherwise you can use the in-memory database, which can be selected via environment variables

  • A .env file with the required values; see the example in the .env.example file

Usage

make install # installs the dependencies defined in go.mod

make mongo # creates a mongodb docker container (obviously needs docker to be installed)

make env # creates a .env file based on .env.example

make test # you've guessed it, tests *_test.go files

make run # runs the project

Env example:

DB_TYPE=mongo
STORE_TYPE=local
PROCESSOR_TYPE=google
STORE_DIRECTORY_NAME=fileStorage
SERVER_PORT=:8080
MONGO_URI=mongodb://myuser:mypass@localhost:27017
MONGO_DB_NAME=expenses_main
MONGO_COLLECTION=receipts
GCP_LOCATION=
GCP_PROJECT_ID=
GCP_PROCESSOR_ID=

Exposed endpoints:

POST /documents - uploads a document (body -> key: "receipt", value: file)
GET /documents/<UUID> - gets existing document data
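
For example, a receipt could be uploaded from Go roughly like this. This is a sketch only: it assumes the server from the env example is listening on localhost:8080, the sample file name is made up, and only the "receipt" form field comes from the endpoint description above.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

func main() {
	f, err := os.Open("receipt.jpg") // hypothetical sample file
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Build a multipart body with the file under the "receipt" key.
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	part, err := w.CreateFormFile("receipt", "receipt.jpg")
	if err != nil {
		panic(err)
	}
	if _, err := io.Copy(part, f); err != nil {
		panic(err)
	}
	w.Close()

	resp, err := http.Post("http://localhost:8080/documents", w.FormDataContentType(), &buf)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // response shape depends on the processor
}
```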

How can this be improved?

To make this project ready to be deployed in a production environment, it needs some adjustments and, obviously, more test coverage. In a production environment we would usually have defined storage such as S3, NFS, or something along those lines, perhaps even both. It would make sense to implement this logic using a pub/sub mechanism: documents are uploaded to the service, which publishes an event to NATS; a second service listening for upload events then replicates the data to the different storages and triggers the scanning service, so that the JSON is stored alongside the original documents.
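
As a rough sketch of that flow, assuming the github.com/nats-io/nats.go client (the subject name, payload fields, and helper function are hypothetical, not existing code):

```go
package events

import (
	"encoding/json"

	"github.com/nats-io/nats.go"
)

// DocumentUploaded is a hypothetical event payload describing a stored document.
type DocumentUploaded struct {
	DocumentID string `json:"document_id"`
	StorePath  string `json:"store_path"`
}

// PublishUploaded emits an event after a document is stored, so downstream
// services (replication to S3/NFS, the scanning service) can react to it.
func PublishUploaded(nc *nats.Conn, ev DocumentUploaded) error {
	payload, err := json.Marshal(ev)
	if err != nil {
		return err
	}
	return nc.Publish("documents.uploaded", payload)
}
```

The upload service would call this after persisting the file, and each consumer would subscribe to the same subject independently.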

It would also be nice to have Prometheus metrics for those microservices after they are split, so that we can keep track of key metrics and use tooling to visualize and monitor them.
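
A minimal sketch of what that could look like with the standard Prometheus client and echo; the metric name and the /metrics route are assumptions, not part of the current project:

```go
package main

import (
	"github.com/labstack/echo/v4"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// documentsProcessed counts successfully parsed documents (hypothetical metric).
var documentsProcessed = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "expensebot_documents_processed_total",
	Help: "Number of documents successfully processed.",
})

func main() {
	prometheus.MustRegister(documentsProcessed)

	e := echo.New()
	// Expose metrics for scraping; the document handler would call
	// documentsProcessed.Inc() after a successful parse.
	e.GET("/metrics", echo.WrapHandler(promhttp.Handler()))
	e.Logger.Fatal(e.Start(":8080"))
}
```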

Perhaps it is also a good idea to write different transport layers once we are talking about microservices: we could have gRPC communication between some internal microservices, and it makes sense to have both HTTP and gRPC in place so that the service can be used by all services that implement those transport layers.
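
One way to keep both transports thin is to put the core logic behind an interface and make each transport a small adapter around it. The sketch below is illustrative only: the interface, handler, and the mentioned generated gRPC types are hypothetical names, not code from this repo.

```go
package transport

import (
	"context"
	"net/http"

	"github.com/labstack/echo/v4"
)

// Processor is the transport-agnostic core; both transports depend only on it.
type Processor interface {
	Process(ctx context.Context, receipt []byte) ([]byte, error)
}

// NewUploadHandler is the HTTP adapter: a thin echo handler delegating to the core.
func NewUploadHandler(p Processor) echo.HandlerFunc {
	return func(c echo.Context) error {
		var body []byte // ... read the multipart "receipt" field into body ...
		out, err := p.Process(c.Request().Context(), body)
		if err != nil {
			return echo.NewHTTPError(http.StatusInternalServerError, err.Error())
		}
		return c.JSONBlob(http.StatusOK, out)
	}
}

// A gRPC adapter would be another thin wrapper: a server struct embedding the
// generated Unimplemented*Server type and calling the same Processor, so
// internal services can talk gRPC while external clients keep using HTTP.
```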

Notes

For the sake of this task I went with a simple flat file structure, which I think is fine, but that is purely subjective. I've used interfaces almost everywhere so that we could have multiple implementations of the logic with different drivers, configs, etc.
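
As a simplified illustration of that interface-driven approach (the names and the driver switch here are illustrative, not the actual code in the repo), a persistence interface with a map-backed implementation might look like this, with a Mongo-backed implementation satisfying the same interface and selected via DB_TYPE:

```go
package storage

import (
	"context"
	"errors"
	"sync"
)

type Document struct {
	ID   string
	Data []byte
}

// Repository abstracts the persistence driver so the Mongo-backed store and
// the in-memory store are interchangeable behind the same interface.
type Repository interface {
	Save(ctx context.Context, d Document) error
	Get(ctx context.Context, id string) (Document, error)
}

// MemoryRepository is the map-backed store used when no MongoDB is available.
type MemoryRepository struct {
	mu   sync.RWMutex
	docs map[string]Document
}

func NewMemoryRepository() *MemoryRepository {
	return &MemoryRepository{docs: make(map[string]Document)}
}

func (m *MemoryRepository) Save(_ context.Context, d Document) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.docs[d.ID] = d
	return nil
}

func (m *MemoryRepository) Get(_ context.Context, id string) (Document, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	d, ok := m.docs[id]
	if !ok {
		return Document{}, errors.New("document not found")
	}
	return d, nil
}
```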

Regarding the unit tests, I did not spend too much time getting to 100% coverage because this task is for demonstration purposes.

Logging is another thing to point out: currently I am using the logger provided by echo, but it would be better to use slog or uber-go/zap for structured logging.
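
With the standard library's log/slog, for example, structured JSON logging is only a few lines; the field names below are illustrative:

```go
package main

import (
	"log/slog"
	"os"
)

func main() {
	// Emit JSON-structured log records to stdout and make it the default logger.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	slog.SetDefault(logger)

	slog.Info("document processed",
		"document_id", "123e4567-e89b-12d3-a456-426614174000",
		"processor", "google",
		"duration_ms", 842,
	)
}
```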

Author: Davit.K
