Releases: pathwaycom/pathway
v0.15.3
v0.15.2
Added
- pw.io.deltalake.read now supports custom S3 Delta Lakes with HTTP endpoints.
- pw.io.deltalake.read now supports specifying both a custom endpoint and a custom region for Delta Lakes via pw.io.s3.AwsS3Settings.
Changed
- Indices in pathway.stdlib.indexing.nearest_neighbors can now also work on numpy arrays. Previously they only accepted list[float]. Working with numpy arrays improves memory efficiency.
- pw.io.s3.read has been optimized to minimize new object requests whenever possible.
- It is now possible to set the size limit of the cache in pw.udfs.DiskCache.
- State persistence now uses a single backend for both metadata and stream storage. The pw.persistence.Config.simple_config method is therefore deprecated. Now you can use the pw.persistence.Config constructor with the same parameters that were previously used in simple_config.
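To give a feel for why packed numeric storage is lighter than list[float], here is a minimal sketch. It uses the standard-library array module as a stand-in for a numpy array (an assumption made so the example has no dependencies), and the byte counts it measures are specific to CPython.

```python
import sys
from array import array

values = [float(i) for i in range(1000)]

# A list stores pointers to separately allocated float objects,
# so its footprint is the list itself plus every float it refers to.
boxed_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# A packed array stores the raw 8-byte doubles contiguously.
packed = array("d", values)
packed_bytes = sys.getsizeof(packed)

print(f"list[float]: {boxed_bytes} bytes, packed doubles: {packed_bytes} bytes")
```

The same reasoning applies to a numpy float64 array, which also keeps one 8-byte value per element in a single buffer.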
Fixed
- pw.io.bigquery.write connector now correctly handles pw.Json columns.
v0.15.1
Fixed
- pw.temporal.session and pw.temporal.asof_join now correctly work with multiple entries with the same time.
- Fixed an issue in pw.stdlib.indexing where filters would cause runtime errors while using HybridIndexFactory.
v0.15.0
Added
- Experimental: a pw.xpacks.llm.document_store.DocumentStore to process and index documents.
- pw.xpacks.llm.servers.DocumentStoreServer used to expose a REST server for retrieving documents from pw.xpacks.llm.document_store.DocumentStore.
- pw.xpacks.stdlib.indexing.HybridIndex used for querying multiple indices and combining their results.
- pw.io.airbyte.read now also supports streams that only operate in full_refresh mode.
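One common way to combine ranked results from several indices is reciprocal rank fusion. The sketch below illustrates that general technique in plain Python. It is not Pathway's implementation, and the function name reciprocal_rank_fusion, the constant k=60, and the sample document ids are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list, best first.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two indices (e.g. a vector index and a keyword index).
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one index are kept but placed lower.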
Changed
- Running servers for answering queries is extracted from pw.xpacks.llm.question_answering.BaseRAGQuestionAnswerer into pw.xpacks.llm.servers.QARestServer and pw.xpacks.llm.servers.QASummaryRestServer.
- BREAKING: query and query_as_of_now of pathway.stdlib.indexing.data_index.DataIndex now produce an empty list instead of None if no match is found.
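Downstream code that used to test for None may need a small adjustment after this change. The following sketch shows one way to write helpers that tolerate both the old and the new "no match" value. The helper names count_matches and first_match are hypothetical, and query results are modeled here as plain Python lists rather than Pathway columns.

```python
def count_matches(query_result):
    """Return the number of matches in one query result.

    Accepts both the old "no match" value (None) and the new one ([]).
    """
    # `or []` maps None to an empty list; an empty list stays empty.
    return len(query_result or [])


def first_match(query_result, default=None):
    """Return the first match, or `default` when there is none."""
    # Both None and [] are falsy, so one truthiness test covers both cases.
    return query_result[0] if query_result else default
```

Checks written as `result is None` no longer fire for an empty result, so a truthiness test like the one above is the safer pattern during migration.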
v0.14.3
Fixed
- pw.io.deltalake.read and pw.io.deltalake.write now correctly work with lakes hosted in S3 over min.io, Wasabi and Digital Ocean.
Added
- The Pathway CLI command spawn can now execute code directly from a specified GitHub repository.
- A new CLI command, spawn-from-env, has been added. This command runs the Pathway CLI spawn command using arguments provided in the PATHWAY_SPAWN_ARGS environment variable.
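The idea behind spawn-from-env can be sketched in a few lines of Python. This is an illustration only: the notes do not describe how the real CLI parses the variable, so shell-style splitting is an assumption here, as are the function name spawn_args_from_env and the sample value.

```python
import os
import shlex


def spawn_args_from_env(environ=None):
    """Build an argument list for `spawn` from PATHWAY_SPAWN_ARGS.

    Assumption: the variable holds a shell-style argument string.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("PATHWAY_SPAWN_ARGS", "")
    # shlex.split honours quotes, so a quoted value with spaces stays one argument.
    return ["spawn", *shlex.split(raw)]


# Hypothetical value; the arguments shown are for illustration only.
example = spawn_args_from_env({"PATHWAY_SPAWN_ARGS": "--processes 2 python main.py"})
print(example)
```

Keeping the arguments in an environment variable is convenient in containerized deployments, where the command line is fixed in the image and only the environment changes between runs.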
v0.14.2
Fixed
- Switched pw.xpacks.llm.embedders.GeminiEmbedder to be sync to resolve compatibility issues with the Google Colab runs.
- Pinned surya-ocr module version for stability.
v0.14.1
Added
- pw.xpacks.llm.embedders.GeminiEmbedder which is a wrapper for Google Gemini Embedding services.
v0.14.0
Fixed
- pw.debug.table_to_pandas now exports int | None columns correctly.
Changed
- pw.io.airbyte.read can now be used with Airbyte connectors implemented in Python without requiring Docker.
- BREAKING: UDFs now verify the type of returned values at runtime. If it is possible to cast a returned value to a proper type, the value is cast. If the value does not match the expected type and can't be cast, an error is raised.
- BREAKING: pw.reducers.ndarray reducer requires input column to either have type float, int or Array.
- pw.xpacks.llm.parsers.OpenParse can now extract and parse images & diagrams from PDFs. This can be enabled by setting parse_images. processing_pipeline can also be set to customize the post processing of doc elements.
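The runtime check on UDF return values follows a "match, else cast, else fail" pattern, which the plain-Python sketch below illustrates. It does not use Pathway and is not its implementation: the names check_return_value and typed are hypothetical, and the only cast rule assumed here is widening int to float.

```python
def check_return_value(value, expected_type):
    """Return `value` as `expected_type`, or raise TypeError."""
    # Exact type match (a simplification: subclasses are not accepted).
    if type(value) is expected_type:
        return value
    # Assumed cast rule for this sketch: an int may be widened to float.
    if expected_type is float and type(value) is int:
        return float(value)
    raise TypeError(
        f"expected {expected_type.__name__}, got {type(value).__name__}"
    )


def typed(expected_type):
    """Decorator that applies check_return_value to a function's result."""
    def wrap(func):
        def inner(*args, **kwargs):
            return check_return_value(func(*args, **kwargs), expected_type)
        return inner
    return wrap


@typed(float)
def half(x):
    return x // 2  # returns an int, which the wrapper widens to float


@typed(int)
def label(x):
    return str(x)  # wrong type that cannot be cast under the assumed rule
```

Calling half(4) yields a float, while label(1) raises TypeError. UDFs that previously returned a value of the wrong type without complaint will now fail in the second way, which is why the change is marked as breaking.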
v0.13.2
Added
- pw.io.deltalake.read now supports S3 data sources.
- pw.xpacks.llm.parsers.ImageParser which allows parsing images with the vision LMs.
- pw.xpacks.llm.parsers.SlideParser that enables parsing PDF and PPTX slides with the vision LMs.
- pw.xpacks.llm.parsers.question_answering.RAGClient, Python client for Pathway hosted RAG apps.
- pw.xpacks.llm.parsers.question_answering.DeckRetriever, a RAG app that enables searching through slide decks with visual-heavy elements.
Fixed
- pw.xpacks.llm.vector_store.VectorStoreServer now uses new indexes.
Changed
- pw.xpacks.llm.parsers.OpenParse now supports any vision language model, including local and proprietary models, via LiteLLM.
v0.13.1
Added
- pw.io.kafka.read now accepts an autogenerate_key flag. This flag determines the primary key generation policy to apply when reading raw data from the source. You can either use the key from the Kafka message or have Pathway autogenerate one.
- pw.io.deltalake.read input connector that fetches changes from DeltaLake into a Pathway table.
- pw.xpacks.llm.parsers.OpenParse which allows parsing tables and images in PDFs.
Fixed
- All S3 input connectors (including S3, Min.io, Digital Ocean, and Wasabi) now automatically retry network operations if a failure occurs.
- The issue where the connection to the S3 source fails after partially ingesting an object has been resolved by downloading the object in full first.
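The two fixes above combine naturally: retry the network call, and parse only after the whole object has arrived. Below is a minimal sketch of that pattern in plain Python. It is not the connector's code, and the names with_retries, read_object_lines, and flaky_download, as well as the retry count and backoff, are assumptions for illustration.

```python
import time


def with_retries(operation, attempts=3, base_delay=0.0):
    """Call `operation` until it succeeds or `attempts` is exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff between attempts.
            time.sleep(base_delay * 2 ** (attempt - 1))


def read_object_lines(download):
    """Fetch the whole object first, then parse it from memory."""
    payload = with_retries(download)  # bytes of the complete object
    return [line.decode() for line in payload.splitlines()]


# Simulated download that fails twice before succeeding.
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return b"a,1\nb,2\n"


lines = read_object_lines(flaky_download)
print(lines, calls["count"])
```

Because parsing starts only once the download has completed, a failure mid-transfer never leaves a half-ingested object behind. The retry simply starts the download again.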