Step 3 is still running.. more than 24 hours #5

Open
viniciustesoni opened this issue Apr 21, 2021 · 3 comments

@viniciustesoni

Hi,
I'm having some issues at step 3.

I have followed all the steps and configurations in the README.

But I ran Step 3 more than 24 hours ago and I am still getting the message "loading word2vec vectors...".
I'm running in SageMaker Studio on a machine with 4 vCPUs and 16 GB of memory.

[Screenshot: Screen Shot 2021-04-21 at 11 45 16]

Is that correct? Or did I do something wrong?
I ran Step 3 in another terminal because Step 2 stopped at this screen.

[Screenshot: Screen Shot 2021-04-21 at 11 45 44]

I really appreciate any suggestion.

@amirmohammadkz
Owner

Hello,
Have you started bert-as-service? You SHOULD run the BERT server in the background first, and then you can run the encoder (step 3). If the code is stuck, it is waiting for a live BERT server (which you probably do not have).
When the BERT server is alive, step 3 reports the elapsed running time and an estimate of the remaining time for the whole process.
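To rule out a dead server quickly, a minimal liveness check along these lines can help (a sketch, not part of this repository; it assumes the server runs on localhost with the default ports 5555/5556 and that the `bert-serving-client` package is installed):

```python
# Minimal liveness check for bert-as-service (a sketch; assumes the server
# runs on localhost with the default ports 5555/5556).
from bert_serving.client import BertClient

# timeout is in milliseconds; with the default (-1) the client waits
# forever, which looks exactly like a silent hang when no server is up.
bc = BertClient(ip='localhost', port=5555, port_out=5556, timeout=10000)
vec = bc.encode(['hello world'])
print(vec.shape)  # the second dimension depends on the pooling configuration
```

If this raises a TimeoutError instead of printing a shape, the server is not reachable and step 3 will hang the same way.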

@viniciustesoni
Author

@amirmohammadkz Yes, I did.
When I run the Step 2 command `bert-serving-start -model_dir uncased_L-12_H-768_A-12/ -num_worker=4 -max_seq_len=NONE -show_tokens_to_client -pooling_layer -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1` to start bert-as-service, it shows me this message:

```
(vj2) bash-4.2$ bert-serving-start -model_dir uncased_L-12_H-768_A-12/ -num_worker=4 -max_seq_len=NONE -show_tokens_to_client -pooling_layer -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1
/opt/conda/envs/vj2/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/opt/conda/envs/vj2/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/opt/conda/envs/vj2/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/opt/conda/envs/vj2/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/opt/conda/envs/vj2/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/opt/conda/envs/vj2/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
usage: /opt/conda/envs/vj2/bin/bert-serving-start -model_dir uncased_L-12_H-768_A-12/ -num_worker=4 -max_seq_len=NONE -show_tokens_to_client -pooling_layer -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1
                    ARG   VALUE
__________________________________________________
              ckpt_name = bert_model.ckpt
            config_name = bert_config.json
                   cors = *
                    cpu = False
             device_map = []
          do_lower_case = True
     fixed_embed_length = False
                   fp16 = False
    gpu_memory_fraction = 0.5
          graph_tmp_dir = None
       http_max_connect = 10
              http_port = None
           mask_cls_sep = False
         max_batch_size = 256
            max_seq_len = None
              model_dir = uncased_L-12_H-768_A-12/
 no_position_embeddings = False
       no_special_token = False
             num_worker = 4
          pooling_layer = [-12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1]
       pooling_strategy = REDUCE_MEAN
                   port = 5555
               port_out = 5556
          prefetch_size = 10
    priority_batch_size = 16
  show_tokens_to_client = True
        tuned_model_dir = None
                verbose = False
                    xla = False

I:VENTILATOR:[__i:__i: 67]:freeze, optimize and export graph, could take a while...
I:GRAPHOPT:[gra:opt: 53]:model config: uncased_L-12_H-768_A-12/bert_config.json
I:GRAPHOPT:[gra:opt: 56]:checkpoint: uncased_L-12_H-768_A-12/bert_model.ckpt
I:GRAPHOPT:[gra:opt: 60]:build graph...

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:

I:GRAPHOPT:[gra:opt:132]:load parameters from checkpoint...
I:GRAPHOPT:[gra:opt:136]:optimize...
I:GRAPHOPT:[gra:opt:144]:freeze...
I:GRAPHOPT:[gra:opt:149]:write graph to a tmp file: /tmp/tmpvb1uc7_b
I:VENTILATOR:[__i:__i: 75]:optimized graph is stored at: /tmp/tmpvb1uc7_b
I:VENTILATOR:[__i:_ru:129]:bind all sockets
I:VENTILATOR:[__i:_ru:133]:open 8 ventilator-worker sockets
I:VENTILATOR:[__i:_ru:136]:start the sink
I:SINK:[__i:_ru:306]:ready
I:VENTILATOR:[__i:_ge:222]:get devices
W:VENTILATOR:[__i:_ge:246]:no GPU available, fall back to CPU
I:VENTILATOR:[__i:_ge:255]:device map:
    worker 0 -> cpu
    worker 1 -> cpu
    worker 2 -> cpu
    worker 3 -> cpu
I:WORKER-0:[__i:_ru:531]:use device cpu, load graph from /tmp/tmpvb1uc7_b
I:WORKER-2:[__i:_ru:531]:use device cpu, load graph from /tmp/tmpvb1uc7_b
I:WORKER-1:[__i:_ru:531]:use device cpu, load graph from /tmp/tmpvb1uc7_b
I:WORKER-3:[__i:_ru:531]:use device cpu, load graph from /tmp/tmpvb1uc7_b
I:WORKER-2:[__i:gen:559]:ready and listening
```

Only one WORKER shows as "ready and listening".
Is that correct?
On the bert-as-service documentation page, it seems that all WORKERS become ready.

@amirmohammadkz
Owner

All workers should be ready before running the code. If you run into a problem with that, check the bert-as-service repository.
My suggestion is to just test the server with 1 worker. It might work for you.
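For reference, a sketch of that suggestion using the bert-as-service Python helper (equivalent to rerunning the `bert-serving-start` command above with `-num_worker=1`; it assumes the `bert-serving-server` package and the same model directory as before):

```python
# Start bert-as-service with a single worker instead of four (a sketch of
# the suggestion above; flags mirror the original command-line invocation).
from bert_serving.server import BertServer
from bert_serving.server.helper import get_args_parser

args = get_args_parser().parse_args([
    '-model_dir', 'uncased_L-12_H-768_A-12/',
    '-num_worker', '1',  # one worker, as suggested
    '-max_seq_len', 'NONE',
    '-show_tokens_to_client',
    '-pooling_layer', '-12', '-11', '-10', '-9', '-8', '-7',
    '-6', '-5', '-4', '-3', '-2', '-1',
])
server = BertServer(args)
server.start()  # runs in a background thread; wait for "ready and listening"
```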
