Java py4j gateway server for Trino queries stays open after the query finishes and later makes queries hang

### Is there an existing issue for this?

- [X] I have searched the existing issues

### Description

The default connector for Trino is JDBC, since to my knowledge there is no SQLAlchemy support for it.
When running Hue with multiple clients, the JDBC class starts a py4j gateway server, which keeps running and is not closed after we get the results. Memory usage grows as a result, so queries get slower as time passes and eventually hang. For example:

>hue      42923 27.1  1.2 2145458 191580 ?      Sl   07:28   0:10 java -classpath /usr/lib/hue/build/env/lib/python2.7/site-packages/py4j-0.9-py2.7.egg/py4j/../share/py4j/py4j0.9.jar:/usr/lib/trino/trino-j
hue      32133 17.3  1.2 2534676 321546 ?      Sl   07:30   0:10 java -classpath /usr/lib/hue/build/env/lib/python2.7/site-packages/py4j-0.9-py2.7.egg/py4j/../share/py4j/py4j0.9.jar:/usr/lib/trino/trino-j
hue      21692 17.2  1.2 6928376 216781 ?      Sl   07:32   0:10 java -classpath /usr/lib/hue/build/env/lib/python2.7/site-packages/py4j-0.9-py2.7.egg/py4j/../share/py4j/py4j0.9.jar:/usr/lib/trino/trino-j

To execute a Trino query, the code flow is, in short:

---

```

1. There is an API call to notebook/api/execute/{dialect}, where the dialect is trino.
2. The call goes through get_interpreter(), and the interpreter resolves to the class notebook.connectors.jdbc.JdbcApi from the file /usr/lib/hue/desktop/libs/notebook/src/notebook/connectors/jdbc.py.
3. This class runs execute, which calls another class from the file desktop/libs/librdbms/src/librdbms/jdbc.py.
4. That class uses py4j as a wrapper around the JDBC connector to run the queries.

```

I added several debugging points to find where the bottleneck is.

The bottleneck is at this point:

```
def query_and_fetch(db, statement, n=None):
  data = None
  try:
    db.connect()
    curs = db.cursor()

    try:
      if curs.execute(statement):
        data = curs.fetchmany(n)
      meta = curs.description
      return data, meta
    finally:
      curs.close()
  except Exception as e:
    message = force_unicode(smart_str(e))
    if 'Access denied' in message:
      raise AuthenticationRequired()
    raise
  finally:
    db.close()
```

-----

The line **data = curs.fetchmany(n)** is usually the bottleneck.
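For anyone who wants to reproduce the measurement, here is a minimal sketch of the kind of timing wrapper that can be placed around the `fetchmany` call. The helper name `timed` and the `sink` list are hypothetical, and `time.sleep` stands in for `curs.fetchmany(n)` so the example runs without a database.

```python
import contextlib
import logging
import time

LOG = logging.getLogger(__name__)


@contextlib.contextmanager
def timed(label, sink=None):
  """Log how long the wrapped block took; optionally record it in `sink`."""
  start = time.time()
  try:
    yield
  finally:
    elapsed = time.time() - start
    LOG.info("%s took %.3fs", label, elapsed)
    if sink is not None:
      sink.append((label, elapsed))


timings = []
with timed("fetchmany", timings):
  time.sleep(0.01)  # hypothetical stand-in for: data = curs.fetchmany(n)
```

Wrapping `curs.execute(statement)` and `curs.fetchmany(n)` separately should show which of the two dominates.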


The issue should be solved by adding the following lines to the `close` function of the JDBC class in this file:
desktop/libs/librdbms/src/librdbms/jdbc.py

```
  def close(self):
    if self.conn is not None:
      self.conn.close()
      self.conn = None

    # Lines to be added: shut down the py4j gateway so its Java child process exits.
    try:
      self.gateway.shutdown()
    except Exception as e:
      LOG.error(e)
```
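One gap in the snippet above: if `self.conn.close()` raises, `self.gateway.shutdown()` is never reached and the Java process would still leak. Below is a minimal sketch of a more defensive ordering. `FakeConn`, `FakeGateway`, and `JdbcLike` are hypothetical stand-ins so the example runs without py4j or a JVM; only the `shutdown()` call mirrors the gateway method used in the proposal. Logging the connection error instead of re-raising it is an assumption of this sketch, not something Hue does today.

```python
import logging

LOG = logging.getLogger(__name__)


class FakeConn(object):
  """Hypothetical stand-in for the JDBC connection; close() fails on purpose."""

  def close(self):
    raise RuntimeError("connection already broken")


class FakeGateway(object):
  """Hypothetical stand-in for the py4j gateway; records shutdown calls."""

  def __init__(self):
    self.shutdown_called = False

  def shutdown(self):
    self.shutdown_called = True


class JdbcLike(object):
  """Hypothetical class with only the conn/gateway attributes from the issue."""

  def __init__(self, conn, gateway):
    self.conn = conn
    self.gateway = gateway

  def close(self):
    try:
      if self.conn is not None:
        self.conn.close()
    except Exception as e:
      LOG.error(e)
    finally:
      self.conn = None
      # Reached even when conn.close() raised above.
      try:
        self.gateway.shutdown()
      except Exception as e:
        LOG.error(e)


gateway = FakeGateway()
db = JdbcLike(FakeConn(), gateway)
db.close()
```

With this ordering the gateway is shut down regardless of how the connection teardown went.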

After adding these lines, the py4j child process gets killed. I verified this with the `ps aux` and `pstree` commands.
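To make that check repeatable, a small sketch that counts py4j gateway processes in `ps aux`-style output may help. The function name `count_py4j_gateways` is hypothetical, and the sample text is shortened, made-up output in the same column layout as the listing in this issue.

```python
def count_py4j_gateways(ps_output):
  """Count lines of `ps aux`-style output that look like py4j Java gateways."""
  count = 0
  for line in ps_output.splitlines():
    # `ps aux` prints 10 columns before the command; keep the command whole.
    fields = line.split(None, 10)
    if len(fields) < 11:
      continue
    command = fields[10]
    if command.startswith("java") and "py4j" in command:
      count += 1
  return count


# Hypothetical, shortened sample in the same layout as the listing above.
sample = """\
hue      42923 27.1  1.2 2145458 191580 ?      Sl   07:28   0:10 java -classpath /path/to/py4j0.9.jar
hue      32133 17.3  1.2 2534676 321546 ?      Sl   07:30   0:10 java -classpath /path/to/py4j0.9.jar
hue       1001  0.1  0.5  500000  80000 ?      Sl   07:00   0:02 python /path/to/hue
"""
leaked = count_py4j_gateways(sample)
```

On a live host the input could come from `subprocess.check_output(["ps", "aux"])`, decoded to text. If the fix works, the count should drop back toward zero once queries finish.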

This could be a good first issue for someone to raise a PR. I work on Hadoop these days, so I don't have the bandwidth to raise one myself; I'm just trying to contribute to OSS. The same flow applies to Presto.

@Harshg999 @bjornalm 

Regards
Vinay Devadiga


### Steps To Reproduce

As stated in the description: use Trino with Hue, create multiple Hue clients, and fire large Trino queries. After some time, the py4j servers consume the available memory and queries hang.

### Logs

Attached above.

### Hue version

Open Source 4.10
